are, make_array, Table, )
'table must have 2 columns: a Values column and a Probability column'
""" Helper function that bins a distribution for plotting

Parameters
----------
dist : Table
    Distribution that needs to be binned.
width (optional) : float
    Width of each bin. (Default: 1)
start (optional) : float
    Where to start first bin. (Default: minimum value of the domain.)
num_bins (optional): int
    Number of bins. (Default: minimum number of bins to span domain)

Returns
-------
(new_domain, new_prob)
    Domain values of the new bins and the associated probabilities

Examples
--------
>>> x = Table().values([0, 0.5, 1]).probability([1 / 3, 1 / 3, 1 / 3])
>>> _bin(x)
(array([ 0.,  1.]), array([ 0.66666667,  0.33333333]))
>>> _bin(x, width=0.5)
(array([ 0. ,  0.5,  1. ]), array([ 0.33333333,  0.33333333,  0.33333333]))
"""
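The binning behaviour described in the docstring above can be sketched in plain Python. This is only an illustration that is consistent with the docstring example, not the library's implementation: `bin_sketch` is a hypothetical name, it takes plain lists rather than a Table, it ignores `num_bins`, and it omits empty bins.

```python
import math
from collections import defaultdict


def bin_sketch(values, probabilities, width=1, start=None):
    """Group values into bins of the given width and sum their probabilities.

    Hypothetical helper: assumes each value falls into the bin
    [start + k * width, start + (k + 1) * width) for some integer k.
    """
    if start is None:
        start = min(values)  # assumed default, as the docstring states
    binned = defaultdict(float)
    for value, prob in zip(values, probabilities):
        index = math.floor((value - start) / width)
        binned[start + index * width] += prob
    new_domain = sorted(binned)
    new_prob = [binned[d] for d in new_domain]
    return new_domain, new_prob


domain_w1, prob_w1 = bin_sketch([0, 0.5, 1], [1 / 3, 1 / 3, 1 / 3])
domain_w05, prob_w05 = bin_sketch([0, 0.5, 1], [1 / 3, 1 / 3, 1 / 3], width=0.5)
```

With width 1 the values 0 and 0.5 share the first bin, which matches the probabilities shown in the docstring example.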
show_sd=False, **vargs):
""" Plots the histogram for a single distribution.

Parameters
----------
dist : Table
    A 2-column table representing a probability distribution.
width (optional) : float
    Width of the intervals (default: 1)
edges (optional) : boolean
    If True, there will be a small border around the bars. If False,
    there will be no border. (default: small border unless there are
    more than 75 bins)
show_ev (optional) : boolean
    Adds a tick mark at the expected value. (default : False)
show_ave (optional) : boolean
    Adds a tick mark at the average of an empirical distribution.
    (default : False)
show_sd (optional) : boolean
    Adds two tick marks, one sd above and one sd below the expected
    value. (default : False)
vargs
    See pyplot's additional optional arguments.
"""
# Basic sanity checks.
# Default plot attributes.
# Set edges.
# Events.
# If event is a list of lists.
len(event) + 1))
# Cycle through each event and remove from the events set.
color=colors[i], **options)
else: # If event is just a list.
else: # No event.
# The minimum distance between any two values.
for i in range(1, dist.num_rows)])
max(dist.column(0)) + min_distance + width / 2))
# Markers.
verticalalignment='top', size=30, color='red')
horizontalalignment='center', verticalalignment='top', size=30, color='blue')
horizontalalignment='center', verticalalignment='top', size=30, color='blue')
""" Overlays histograms for multiple probability distributions together.

Parameters
----------
labels_and_dists : Even number of alternations between Strings and Tables
    Each distribution must have a label associated with it.
width (optional) : float
    Width of the intervals. (default: 1)
edges : bool
    If True, there will be a small border around the bars. If False,
    there will be no border. (default: small border unless there are
    more than 75 bins)
vargs
    See pyplot's documentation

Examples
--------
>>> dist1 = Table().values([1, 2, 3, 4]).probability([1/4, 1/4, 1/4, 1/4])
>>> dist2 = Table().values([3, 4, 5, 6]).probability([1/2, 1/8, 1/8, 1/4])
>>> Plots('Distribution1', dist1, 'Distribution2', dist2)
<histogram with dist1 and dist2>
"""
# Find the union of the domain of all plotted distributions.
'Argument {} must be a distribution'.format(i)
# Find and bin the probabilities corresponding to each value in domain
# for each distribution.
# Set the plot attributes.
options['lw'] = 0
label=labels_and_dists[i * 2], **options)
for i in range(1, len(domain))]), 1)
max(domain) + min_distance + width / 2))
""" Assigns domain values to a single-variable distribution

Parameters
----------
values : List or Array
    Values to put into the domain

Returns
-------
Table
    Table with those domain values in its first column

Examples
--------
>>> Table().values([1, 2, 3])
Value
1
2
3
"""
""" Assigns probabilities to a Distribution via a probability function.

The probability function is applied to each value of the domain. The
domain values must already be in the first column(s).

Parameters
----------
pfunc : func
    Probability function of the distribution.

Returns
-------
Table
    Table with probabilities in its last column.
"""
""" Assigns probabilities to domain values.

Parameters
----------
values : List or Array
    Values that must correspond to the domain in the same order.

Returns
-------
Table
    A probability distribution with those probabilities
"""
""" Assigns transition probabilities to a Distribution via a probability
function.

The probability function is applied to each value of the domain. The
domain values must already be in the first column.

Parameters
----------
pfunc : variate function
    Conditional probability function of the distribution (P(Y | X))

Returns
-------
Table
    Table with those probabilities in its final column
"""
warnings.warn('Probability cannot be negative')
all_other_vars, conditioned_var), values)
""" For a multivariate probability distribution, assigns transition
probabilities, i.e. P(Y | X).

Parameters
----------
values : List or Array
    Values that must correspond to the domain in the same order

Returns
-------
Table
    A probability distribution with those probabilities
"""
warnings.warn('Probability cannot be negative')
warnings.warn(
    'Transition probabilities for {} sum to {:.04f} not 1'.format(
        row[0], row[-1])
)
""" Finds the probability of an event x.

Parameters
----------
x : float or Iterable
    An event represented either as a specific value in the domain or a
    subset of the domain

Returns
-------
float
    Probability of the event

Examples
--------
>>> dist = Table().values([1, 2, 3, 4]).probability([1/4, 1/4, 1/4, 1/4])
>>> dist.prob_event(2)
0.25
>>> dist.prob_event([2, 3])
0.5
>>> dist.prob_event(np.arange(1, 5))
1.0
"""
else:
""" Shows the probability that distribution takes on value x or list of values x.

Parameters
----------
x : float or Iterable
    An event represented either as a specific value in the domain or a
    subset of the domain

Returns
-------
Table
    Shows the probabilities of each value in the event

Examples
--------
>>> dist = Table().values([1, 2, 3, 4]).probability([1/4, 1/4, 1/4, 1/4])
>>> dist.event(2)
Domain | Probability
2      | 0.25
>>> dist.event([2,3])
Domain | Probability
2      | 0.25
3      | 0.25
"""
""" Returns the distribution by making the probabilities sum to 1

Returns
-------
Table
    A distribution with the probabilities normalized

Examples
--------
>>> Table().values([1, 2, 3]).probability([1, 1, 1])
Value | Probability
1     | 1
2     | 1
3     | 1
>>> Table().values([1, 2, 3]).probability([1, 1, 1]).normalized()
Value | Probability
1     | 0.333333
2     | 0.333333
3     | 0.333333
"""
column_label,
self.column(column_label) / sum(self.column(column_label))
)
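The normalization step above divides every probability by the column total. A minimal sketch on a plain list, assuming the weights are non-negative; `normalized_sketch` is a hypothetical name and the zero-total check is an addition that the source does not show.

```python
def normalized_sketch(probabilities):
    """Rescale non-negative weights so that they sum to 1.

    Hypothetical helper operating on a plain list, not on a Table.
    """
    total = sum(probabilities)
    if total == 0:
        # Assumed guard: an all-zero column cannot be normalized.
        raise ValueError("weights must not sum to zero")
    return [p / total for p in probabilities]


uniform = normalized_sketch([1, 1, 1])
```

Each entry of `uniform` is one third, matching the normalized table in the docstring.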
""" Randomly samples from the distribution.
Note that this function was previously named `sample` but was renamed due to naming conflicts with the datascience library.

Parameters
----------
n : int
    Number of times to sample from the distribution (default: 1)

Returns
-------
float or array
    Samples from the distribution

Examples
--------
>>> dist = Table().with_columns(
...     'Value', make_array(2, 3, 4),
...     'Probability', make_array(0.25, 0.5, 0.25))
>>> dist.sample_from_dist()
3
>>> dist.sample_from_dist()
2
>>> dist.sample_from_dist(10)
array([3, 2, 2, 4, 3, 4, 3, 4, 3, 3])
"""
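Weighted sampling of this kind can be sketched with the standard library's `random.choices`. This is an illustration under assumptions, not the library's code: `sample_from_dist_sketch` is a hypothetical name, it takes plain lists, and it returns a list rather than a NumPy array when `n` is greater than 1.

```python
import random


def sample_from_dist_sketch(values, probabilities, n=1, rng=random):
    """Draw n values, each chosen with its associated probability.

    Hypothetical helper: returns a single value when n == 1 and a list
    otherwise, mirroring the "float or array" return in the docstring.
    """
    draws = rng.choices(values, weights=probabilities, k=n)
    return draws[0] if n == 1 else draws


seeded = random.Random(0)  # fixed seed only to make the run repeatable
one_draw = sample_from_dist_sketch([2, 3, 4], [0.25, 0.5, 0.25], rng=seeded)
ten_draws = sample_from_dist_sketch([2, 3, 4], [0.25, 0.5, 0.25], 10, rng=seeded)
```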
""" Finds the cdf of the distribution

Parameters
----------
x : float
    Value in distribution

Returns
-------
float
    P(X <= x)

Examples
--------
>>> dist = Table().with_columns(
...     'Value', make_array(2, 3, 4),
...     'Probability', make_array(0.25, 0.5, 0.25))
>>> dist.cdf(0)
0
>>> dist.cdf(2)
0.25
>>> dist.cdf(3.5)
0.75
>>> dist.cdf(1000)
1.0
"""
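For a discrete distribution, P(X <= x) is the sum of the probabilities of all domain values that do not exceed x. A minimal sketch on plain lists; `cdf_sketch` is a hypothetical name, not the library method.

```python
def cdf_sketch(values, probabilities, x):
    """Return P(X <= x) by summing probabilities of values at or below x."""
    return sum(p for v, p in zip(values, probabilities) if v <= x)


values = [2, 3, 4]
probabilities = [0.25, 0.5, 0.25]
below_domain = cdf_sketch(values, probabilities, 0)
at_two = cdf_sketch(values, probabilities, 2)
between = cdf_sketch(values, probabilities, 3.5)
above_domain = cdf_sketch(values, probabilities, 1000)
```

The four results reproduce the values in the docstring example: 0, 0.25, 0.75 and 1.0.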
""" Finds expected value of distribution

Returns
-------
float
    Expected value

Examples
--------
>>> dist = Table().values([1, 2, 4]).probability([0.5, 0.4, 0.1])
>>> dist.ev()
1.7
>>> 1 * 0.5 + 2 * 0.4 + 4 * 0.1
1.7
"""
""" Finds variance of distribution

Returns
-------
float
    Variance

Examples
--------
>>> dist = Table().values([1, 2, 4]).probability([0.5, 0.4, 0.1])
>>> dist.var()
0.81
>>> (1 * 0.5 + 4 * 0.4 + 16 * 0.1) - (1.7) ** 2
0.81
"""
""" Finds standard deviation of Distribution.

Returns
-------
float
    Standard Deviation

Examples
--------
>>> dist = Table().values([1, 2, 4]).probability([0.5, 0.4, 0.1])
>>> dist.sd()
0.9
"""
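The three summary statistics documented above are related: the variance is E[X^2] minus the square of E[X], as the `var` example computes, and the standard deviation is its square root. A minimal sketch on plain lists; the `*_sketch` names are hypothetical, and floating-point results may differ from the rounded docstring values in the last digits.

```python
import math


def ev_sketch(values, probabilities):
    """Expected value: sum of value times probability."""
    return sum(v * p for v, p in zip(values, probabilities))


def var_sketch(values, probabilities):
    """Variance computed as E[X^2] - (E[X])^2."""
    mean = ev_sketch(values, probabilities)
    second_moment = sum(v * v * p for v, p in zip(values, probabilities))
    return second_moment - mean ** 2


def sd_sketch(values, probabilities):
    """Standard deviation: square root of the variance."""
    return math.sqrt(var_sketch(values, probabilities))


values = [1, 2, 4]
probabilities = [0.5, 0.4, 0.1]
mean = ev_sketch(values, probabilities)
variance = var_sketch(values, probabilities)
spread = sd_sketch(values, probabilities)
```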
""" Removes all values with zero probability from the Distribution.

Returns
-------
Distribution

Examples
--------
>>> dist = Table().values([2, 3, 4, 5]).probability([0.5, 0, 0.5, 0])
>>> dist
Value | Probability
2     | 0.5
3     | 0
4     | 0.5
5     | 0
>>> dist.remove_zeros()
Value | Probability
2     | 0.5
4     | 0.5
"""
""" Takes an array of values and returns an empirical distribution

Parameters
----------
values : array
    Array of values that will be grouped by the distribution

Returns
-------
Table
    A distribution

Examples
--------
>>> x = make_array(1, 1, 1, 1, 1, 2, 3, 3, 3, 4)
>>> emp_dist(x)
Value | Proportion
1     | 0.5
2     | 0.1
3     | 0.3
4     | 0.1
"""
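An empirical distribution assigns each distinct value the fraction of observations equal to it. A minimal sketch with `collections.Counter`; `emp_dist_sketch` is a hypothetical name, and it returns a list of (value, proportion) pairs instead of a Table.

```python
from collections import Counter


def emp_dist_sketch(values):
    """Return (value, proportion) pairs sorted by value."""
    counts = Counter(values)
    total = len(values)
    return [(value, counts[value] / total) for value in sorted(counts)]


observed = [1, 1, 1, 1, 1, 2, 3, 3, 3, 4]
proportions = emp_dist_sketch(observed)
```

For the ten observations above, the proportions are 0.5, 0.1, 0.3 and 0.1, as in the docstring table.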
'Proportion', position_counts.column(1) / total )
# Brighter colors than the default Table class
rgb('darkblue'),
rgb('gold'),
(106 / 256, 166 / 256, 53 / 256),  # vivid green
(234 / 256, 77 / 256, 108 / 256),  # rose garden
rgb('brown'),
(240 / 256, 127 / 256, 80 / 256),  # vivid orange
(53 / 256, 148 / 256, 216 / 256),  # vivid blue
(122 / 256, 55 / 256, 139 / 256),  # purple
rgb('black'),
rgb('red'),
)