PyDP

Algorithms

class pydp.algorithms.laplacian.BoundedMean(epsilon: float = 1.0, lower_bound: Optional[Union[int, float]] = None, upper_bound: Optional[Union[int, float]] = None, l0_sensitivity: int = 1, linf_sensitivity: int = 1, dtype: str = 'int')¶

BoundedMean computes the average of values in a dataset, in a differentially private manner.

Incrementally provides a differentially private average. All input values are normalized to their difference from the middle of the input range. This lets the sum of all input values be computed with half the sensitivity it would otherwise require, which improves accuracy compared with a plain noisy sum divided by a noisy count. The algorithm is taken from section 2.5.5 (algorithm 2.4) of the following book: https://books.google.com/books?id=WFttDQAAQBAJ&pg=PA24#v=onepage&q&f=false
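The normalization idea can be sketched in plain Python. This is an illustration only, not PyDP's implementation: the names `laplace_noise` and `dp_bounded_mean_sketch`, the even split of epsilon between the sum and the count, and the requirement that `lower < upper` are all assumptions made for the example.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale), for scale > 0."""
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_bounded_mean_sketch(values, lower, upper, epsilon, rng=None):
    """Illustrative differentially private mean; requires lower < upper."""
    if rng is None:
        rng = random.Random()
    midpoint = (lower + upper) / 2.0
    half_range = (upper - lower) / 2.0

    # Clamp each value into [lower, upper], then measure it from the midpoint.
    # One entry can now move the sum by at most half_range.
    normalized_sum = sum(min(max(v, lower), upper) - midpoint for v in values)

    # Assumption: epsilon is split evenly between the sum and the count.
    eps_part = epsilon / 2.0
    noisy_sum = normalized_sum + laplace_noise(half_range / eps_part, rng)
    noisy_count = len(values) + laplace_noise(1.0 / eps_part, rng)

    # Guard against a tiny or negative noisy count, then undo the shift.
    mean = noisy_sum / max(noisy_count, 1.0) + midpoint
    return min(max(mean, lower), upper)
```

With bounds of 0 and 10, each normalized value lies in [-5, 5], so the noise on the sum is calibrated to 5 rather than 10.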

add_entries(data: List[Union[int, float]]) → None ¶

Adds multiple inputs to the algorithm.

Note: If the data exceeds the overflow limit of storage, the current list passed is not added.

add_entry(value: Union[int, float]) → None ¶

Adds one input to the algorithm.

Note: If the data exceeds the overflow limit of storage, the current data passed is not added.

property epsilon¶: Returns the epsilon set at initialization.

property l0_sensitivity¶: Returns the l0_sensitivity set at initialization.

property linf_sensitivity¶: Returns the linf_sensitivity set at initialization.

memory_used() → float ¶: Returns the memory currently used by the algorithm in bytes.

merge(summary)¶: Merges serialized summary data into this algorithm. The summary proto must represent data from the same algorithm type with identical parameters. The data field must contain the algorithm summary type of the corresponding algorithm used. The summary proto cannot be empty.

noise_confidence_interval(confidence_level: float, privacy_budget: float) → float ¶

Returns the confidence_level confidence interval of noise added within the algorithm with specified privacy budget, using epsilon and other relevant, algorithm-specific parameters (e.g. bounds) provided by the constructor.

This metric may be used to gauge the error rate introduced by the noise.

If the returned value is <x,y>, then the noise added has a confidence_level chance of being in the domain [x,y].

By default, noise_confidence_interval() returns an error. Algorithms for which a confidence interval can feasibly be calculated override this and output the relevant value.

Conservatively, we do not release the error rate for algorithms whose confidence intervals rely on input size.
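To make the meaning of the interval concrete, here is a minimal sketch for the simplest case: zero-mean Laplace noise with a known scale. The function name `laplace_confidence_interval` is hypothetical, and the assumption that the noise is a single Laplace draw may not match how a given PyDP algorithm actually computes its interval.

```python
import math


def laplace_confidence_interval(confidence_level, scale):
    """Symmetric interval [x, y] that holds Laplace(0, scale) noise
    with probability `confidence_level`."""
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must lie strictly between 0 and 1")
    if scale <= 0.0:
        raise ValueError("scale must be positive")
    # For X ~ Laplace(0, scale): P(|X| <= t) = 1 - exp(-t / scale),
    # so t = -scale * ln(1 - confidence_level).
    bound = -scale * math.log(1.0 - confidence_level)
    return (-bound, bound)
```

For example, with a scale of 1.0 the 95% interval is roughly [-3.0, 3.0], since ln(0.05) is about -2.996.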

privacy_budget_left() → float ¶: Returns the remaining privacy budget.

quick_result(data: List[Union[int, float]]) → Union[int, float]¶

Runs the algorithm on the input using the epsilon parameter provided in the constructor and returns output.

Consumes 100% of the privacy budget.

Note: It resets the privacy budget first.

reset() → None ¶: Resets the algorithm to a state in which it has received no input. After reset() is called, the algorithm considers only input added after the most recent reset() call when providing output.

result(privacy_budget: Optional[float] = None, noise_interval_level: Optional[float] = None) → Union[int, float]¶

Gets the algorithm result.

The default call consumes the remaining privacy budget.

When privacy_budget (defined on [0,1]) is set, it consumes only the privacy_budget amount of budget.

noise_interval_level provides the confidence level of the noise confidence interval, which may be included in the algorithm output.
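The budget semantics described above can be modeled with a small bookkeeping class. This is a hypothetical sketch, not PyDP code: the class name `BudgetTrackerSketch`, its `consume` method, and the choice to raise `ValueError` when the request exceeds what is left are assumptions for illustration.

```python
class BudgetTrackerSketch:
    """Tracks the fraction of the privacy budget that is still unspent."""

    def __init__(self):
        self.remaining = 1.0  # the full budget is available at the start

    def consume(self, privacy_budget=None):
        """Spend a fraction of the budget and return the amount spent.

        With no argument, everything that remains is spent, mirroring the
        default behavior described for result().
        """
        if privacy_budget is None:
            privacy_budget = self.remaining
        if not 0.0 < privacy_budget <= self.remaining:
            raise ValueError("requested budget is not available")
        self.remaining -= privacy_budget
        return privacy_budget
```

A caller that wants two results from the same data could spend 0.25 on the first and let the second take the remaining 0.75.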

serialize()¶

Serializes summary data of current entries into Summary proto. This allows results from distributed aggregation to be recorded and later merged.

Returns empty summary for algorithms for which serialize is unimplemented.

class pydp.algorithms.laplacian.BoundedSum(epsilon: float = 1.0, lower_bound: Optional[Union[int, float]] = None, upper_bound: Optional[Union[int, float]] = None, l0_sensitivity: int = 1, linf_sensitivity: int = 1, dtype: str = 'int')¶

BoundedSum computes the sum of values in a dataset, in a differentially private manner.

Incrementally provides a differentially private sum, clamped between upper and lower values. Bounds can be manually set or privately inferred.
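A minimal sketch of a clamped, noisy sum with manually set bounds follows. It is not PyDP's implementation: the names `laplace_noise` and `dp_bounded_sum_sketch` are hypothetical, private bound inference is not covered, and the bounds are assumed not to both be zero.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale), for scale > 0."""
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_bounded_sum_sketch(values, lower, upper, epsilon, rng=None):
    """Illustrative differentially private sum with manual bounds."""
    if rng is None:
        rng = random.Random()
    # Clamp every input into [lower, upper] before summing.
    clamped_sum = sum(min(max(v, lower), upper) for v in values)
    # Adding or removing one entry changes the clamped sum by at most
    # the larger of the two bound magnitudes.
    sensitivity = max(abs(lower), abs(upper))
    return clamped_sum + laplace_noise(sensitivity / epsilon, rng)
```

Tighter bounds reduce the noise scale but clamp more of the data, so the choice of bounds trades bias against variance.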

add_entries(data: List[Union[int, float]]) → None ¶

Adds multiple inputs to the algorithm.

Note: If the data exceeds the overflow limit of storage, the current list passed is not added.

add_entry(value: Union[int, float]) → None ¶

Adds one input to the algorithm.

Note: If the data exceeds the overflow limit of storage, the current data passed is not added.

property epsilon¶: Returns the epsilon set at initialization.

property l0_sensitivity¶: Returns the l0_sensitivity set at initialization.

property linf_sensitivity¶: Returns the linf_sensitivity set at initialization.

memory_used() → float ¶: Returns the memory currently used by the algorithm in bytes.

merge(summary)¶: Merges serialized summary data into this algorithm. The summary proto must represent data from the same algorithm type with identical parameters. The data field must contain the algorithm summary type of the corresponding algorithm used. The summary proto cannot be empty.

noise_confidence_interval(confidence_level: float, privacy_budget: float) → float ¶

Returns the confidence_level confidence interval of noise added within the algorithm with specified privacy budget, using epsilon and other relevant, algorithm-specific parameters (e.g. bounds) provided by the constructor.

This metric may be used to gauge the error rate introduced by the noise.

If the returned value is <x,y>, then the noise added has a confidence_level chance of being in the domain [x,y].

By default, noise_confidence_interval() returns an error. Algorithms for which a confidence interval can feasibly be calculated override this and output the relevant value.

Conservatively, we do not release the error rate for algorithms whose confidence intervals rely on input size.

privacy_budget_left() → float ¶: Returns the remaining privacy budget.

quick_result(data: List[Union[int, float]]) → Union[int, float]¶

Runs the algorithm on the input using the epsilon parameter provided in the constructor and returns output.

Consumes 100% of the privacy budget.

Note: It resets the privacy budget first.

reset() → None ¶: Resets the algorithm to a state in which it has received no input. After reset() is called, the algorithm considers only input added after the most recent reset() call when providing output.

result(privacy_budget: Optional[float] = None, noise_interval_level: Optional[float] = None) → Union[int, float]¶

Gets the algorithm result.

The default call consumes the remaining privacy budget.

When privacy_budget (defined on [0,1]) is set, it consumes only the privacy_budget amount of budget.

noise_interval_level provides the confidence level of the noise confidence interval, which may be included in the algorithm output.

serialize()¶

Serializes summary data of current entries into Summary proto. This allows results from distributed aggregation to be recorded and later merged.

Returns empty summary for algorithms for which serialize is unimplemented.

class pydp.algorithms.laplacian.BoundedStandardDeviation(epsilon: float = 1.0, lower_bound: Optional[Union[int, float]] = None, upper_bound: Optional[Union[int, float]] = None, l0_sensitivity: int = 1, linf_sensitivity: int = 1, dtype: str = 'int')¶

BoundedStandardDeviation computes the standard deviation of values in a dataset, in a differentially private manner.

Incrementally provides a differentially private standard deviation for values in the range [lower..upper]. Values outside of this range will be clamped so they lie in the range. The output will also be clamped between 0 and (upper - lower).

The implementation simply computes the bounded variance and takes the square root, which is differentially private by the post-processing theorem. It relies on the fact that the bounded variance algorithm guarantees that the output is non-negative.
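The post-processing step is small enough to sketch directly. The function name `std_from_private_variance` is hypothetical, and the input is assumed to be a variance estimate that has already been made differentially private elsewhere.

```python
import math


def std_from_private_variance(private_variance, lower, upper):
    """Turn an already-private variance into a standard deviation.

    No further noise is needed: any function of a differentially private
    output is itself differentially private (post-processing).
    """
    max_variance = (upper - lower) ** 2
    # Clamp to [0, (upper - lower)^2] so the square root is always defined
    # and the result lies in [0, upper - lower].
    clamped = min(max(private_variance, 0.0), max_variance)
    return math.sqrt(clamped)
```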

add_entries(data: List[Union[int, float]]) → None ¶

Adds multiple inputs to the algorithm.

Note: If the data exceeds the overflow limit of storage, the current list passed is not added.

add_entry(value: Union[int, float]) → None ¶

Adds one input to the algorithm.

Note: If the data exceeds the overflow limit of storage, the current data passed is not added.

property epsilon¶: Returns the epsilon set at initialization.

property l0_sensitivity¶: Returns the l0_sensitivity set at initialization.

property linf_sensitivity¶: Returns the linf_sensitivity set at initialization.

memory_used() → float ¶: Returns the memory currently used by the algorithm in bytes.

merge(summary)¶: Merges serialized summary data into this algorithm. The summary proto must represent data from the same algorithm type with identical parameters. The data field must contain the algorithm summary type of the corresponding algorithm used. The summary proto cannot be empty.

noise_confidence_interval(confidence_level: float, privacy_budget: float) → float ¶

Returns the confidence_level confidence interval of noise added within the algorithm with specified privacy budget, using epsilon and other relevant, algorithm-specific parameters (e.g. bounds) provided by the constructor.

This metric may be used to gauge the error rate introduced by the noise.

If the returned value is <x,y>, then the noise added has a confidence_level chance of being in the domain [x,y].

By default, noise_confidence_interval() returns an error. Algorithms for which a confidence interval can feasibly be calculated override this and output the relevant value.

Conservatively, we do not release the error rate for algorithms whose confidence intervals rely on input size.

privacy_budget_left() → float ¶: Returns the remaining privacy budget.

quick_result(data: List[Union[int, float]]) → Union[int, float]¶

Runs the algorithm on the input using the epsilon parameter provided in the constructor and returns output.

Consumes 100% of the privacy budget.

Note: It resets the privacy budget first.

reset() → None ¶: Resets the algorithm to a state in which it has received no input. After reset() is called, the algorithm considers only input added after the most recent reset() call when providing output.

result(privacy_budget: Optional[float] = None, noise_interval_level: Optional[float] = None) → Union[int, float]¶

Gets the algorithm result.

The default call consumes the remaining privacy budget.

When privacy_budget (defined on [0,1]) is set, it consumes only the privacy_budget amount of budget.

noise_interval_level provides the confidence level of the noise confidence interval, which may be included in the algorithm output.

serialize()¶

Serializes summary data of current entries into Summary proto. This allows results from distributed aggregation to be recorded and later merged.

Returns empty summary for algorithms for which serialize is unimplemented.

class pydp.algorithms.laplacian.BoundedVariance(epsilon: float = 1.0, lower_bound: Optional[Union[int, float]] = None, upper_bound: Optional[Union[int, float]] = None, l0_sensitivity: int = 1, linf_sensitivity: int = 1, dtype: str = 'int')¶

BoundedVariance computes the variance of values in a dataset, in a differentially private manner.

Incrementally provides a differentially private variance for values in the range [lower..upper]. Values outside of this range will be clamped so they lie in the range. The output will also be clamped between 0 and (upper - lower)^2. Since the result is guaranteed to be non-negative, this algorithm can be used to compute a differentially private standard deviation.

The algorithm uses O(1) memory and runs in O(n) time, where n is the size of the dataset, making it fast and efficient. The amount of noise added grows quadratically in (upper - lower) and decreases linearly in n, so it might not produce good results unless n >> (upper - lower)^2.

The algorithm is a variation of the algorithm for differentially private mean from “Differential Privacy: From Theory to Practice”, section 2.5.5: https://books.google.com/books?id=WFttDQAAQBAJ&pg=PA24#v=onepage&q&f=false
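The O(1) memory claim follows from keeping only three running totals. The sketch below shows one way this could look; it is not PyDP's implementation. The class name `BoundedVarianceSketch`, the three-way split of epsilon, and the sensitivities chosen for each total are assumptions for illustration, and `lower < upper` is required.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale), for scale > 0."""
    # Difference of two i.i.d. exponentials with mean `scale`.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


class BoundedVarianceSketch:
    """Streaming variance estimate that stores only three running totals."""

    def __init__(self, lower, upper, epsilon, rng=None):
        self.lower, self.upper, self.epsilon = lower, upper, epsilon
        self.rng = rng if rng is not None else random.Random()
        self.midpoint = (lower + upper) / 2.0
        self.half_range = (upper - lower) / 2.0
        self.count = 0
        self.total = 0.0     # sum of normalized values
        self.total_sq = 0.0  # sum of squared normalized values

    def add_entry(self, value):
        # Clamp into [lower, upper], then measure from the midpoint.
        x = min(max(value, self.lower), self.upper) - self.midpoint
        self.count += 1
        self.total += x
        self.total_sq += x * x

    def result(self):
        # Assumption: epsilon is split evenly across the three totals.
        eps_part = self.epsilon / 3.0
        h = self.half_range
        n = max(self.count + laplace_noise(1.0 / eps_part, self.rng), 1.0)
        s = self.total + laplace_noise(h / eps_part, self.rng)
        # Squared values lie in [0, h^2], hence the quadratic noise scale.
        sq = self.total_sq + laplace_noise(h * h / eps_part, self.rng)
        variance = sq / n - (s / n) ** 2
        # Clamp the output to [0, (upper - lower)^2].
        return min(max(variance, 0.0), (self.upper - self.lower) ** 2)
```

Shifting every value by the midpoint does not change the variance, so no correction is needed after the noisy totals are combined.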

add_entries(data: List[Union[int, float]]) → None ¶

Adds multiple inputs to the algorithm.

Note: If the data exceeds the overflow limit of storage, the current list passed is not added.

add_entry(value: Union[int, float]) → None ¶

Adds one input to the algorithm.

Note: If the data exceeds the overflow limit of storage, the current data passed is not added.

property epsilon¶: Returns the epsilon set at initialization.

property l0_sensitivity¶: Returns the l0_sensitivity set at initialization.

property linf_sensitivity¶: Returns the linf_sensitivity set at initialization.

memory_used() → float ¶: Returns the memory currently used by the algorithm in bytes.

merge(summary)¶: Merges serialized summary data into this algorithm. The summary proto must represent data from the same algorithm type with identical parameters. The data field must contain the algorithm summary type of the corresponding algorithm used. The summary proto cannot be empty.

noise_confidence_interval(confidence_level: float, privacy_budget: float) → float ¶

Returns the confidence_level confidence interval of noise added within the algorithm with specified privacy budget, using epsilon and other relevant, algorithm-specific parameters (e.g. bounds) provided by the constructor.

This metric may be used to gauge the error rate introduced by the noise.

If the returned value is <x,y>, then the noise added has a confidence_level chance of being in the domain [x,y].

By default, noise_confidence_interval() returns an error. Algorithms for which a confidence interval can feasibly be calculated override this and output the relevant value.

Conservatively, we do not release the error rate for algorithms whose confidence intervals rely on input size.

privacy_budget_left() → float ¶: Returns the remaining privacy budget.

quick_result(data: List[Union[int, float]]) → Union[int, float]¶

Runs the algorithm on the input using the epsilon parameter provided in the constructor and returns output.

Consumes 100% of the privacy budget.

Note: It resets the privacy budget first.

reset() → None ¶: Resets the algorithm to a state in which it has received no input. After reset() is called, the algorithm considers only input added after the most recent reset() call when providing output.

result(privacy_budget: Optional[float] = None, noise_interval_level: Optional[float] = None) → Union[int, float]¶

Gets the algorithm result.

The default call consumes the remaining privacy budget.

When privacy_budget (defined on [0,1]) is set, it consumes only the privacy_budget amount of budget.

noise_interval_level provides the confidence level of the noise confidence interval, which may be included in the algorithm output.

serialize()¶

Serializes summary data of current entries into Summary proto. This allows results from distributed aggregation to be recorded and later merged.

Returns empty summary for algorithms for which serialize is unimplemented.

class pydp.algorithms.laplacian.Max(epsilon: float = 1.0, lower_bound: Optional[Union[int, float]] = None, upper_bound: Optional[Union[int, float]] = None, l0_sensitivity: int = 1, linf_sensitivity: int = 1, dtype: str = 'int')¶

Max computes the maximum value in the dataset, in a differentially private manner.

add_entries(data: List[Union[int, float]]) → None ¶

Adds multiple inputs to the algorithm.

Note: If the data exceeds the overflow limit of storage, the current list passed is not added.

add_entry(value: Union[int, float]) → None ¶

Adds one input to the algorithm.

Note: If the data exceeds the overflow limit of storage, the current data passed is not added.

property epsilon¶: Returns the epsilon set at initialization.

property l0_sensitivity¶: Returns the l0_sensitivity set at initialization.

property linf_sensitivity¶: Returns the linf_sensitivity set at initialization.

memory_used() → float ¶: Returns the memory currently used by the algorithm in bytes.

merge(summary)¶: Merges serialized summary data into this algorithm. The summary proto must represent data from the same algorithm type with identical parameters. The data field must contain the algorithm summary type of the corresponding algorithm used. The summary proto cannot be empty.

noise_confidence_interval(confidence_level: float, privacy_budget: float) → float ¶

Returns the confidence_level confidence interval of noise added within the algorithm with specified privacy budget, using epsilon and other relevant, algorithm-specific parameters (e.g. bounds) provided by the constructor.

This metric may be used to gauge the error rate introduced by the noise.

If the returned value is <x,y>, then the noise added has a confidence_level chance of being in the domain [x,y].

By default, noise_confidence_interval() returns an error. Algorithms for which a confidence interval can feasibly be calculated override this and output the relevant value.

Conservatively, we do not release the error rate for algorithms whose confidence intervals rely on input size.

privacy_budget_left() → float ¶: Returns the remaining privacy budget.

quick_result(data: List[Union[int, float]]) → Union[int, float]¶

Runs the algorithm on the input using the epsilon parameter provided in the constructor and returns output.

Consumes 100% of the privacy budget.

Note: It resets the privacy budget first.

reset() → None ¶: Resets the algorithm to a state in which it has received no input. After reset() is called, the algorithm considers only input added after the most recent reset() call when providing output.

result(privacy_budget: Optional[float] = None, noise_interval_level: Optional[float] = None) → Union[int, float]¶

Gets the algorithm result.

The default call consumes the remaining privacy budget.

When privacy_budget (defined on [0,1]) is set, it consumes only the privacy_budget amount of budget.

noise_interval_level provides the confidence level of the noise confidence interval, which may be included in the algorithm output.

serialize()¶

Serializes summary data of current entries into Summary proto. This allows results from distributed aggregation to be recorded and later merged.

Returns empty summary for algorithms for which serialize is unimplemented.

class pydp.algorithms.laplacian.Min(epsilon: float = 1.0, lower_bound: Optional[Union[int, float]] = None, upper_bound: Optional[Union[int, float]] = None, l0_sensitivity: int = 1, linf_sensitivity: int = 1, dtype: str = 'int')¶

Min computes the minimum value in the dataset, in a differentially private manner.

add_entries(data: List[Union[int, float]]) → None ¶

Adds multiple inputs to the algorithm.

Note: If the data exceeds the overflow limit of storage, the current list passed is not added.

add_entry(value: Union[int, float]) → None ¶

Adds one input to the algorithm.

Note: If the data exceeds the overflow limit of storage, the current data passed is not added.

property epsilon¶: Returns the epsilon set at initialization.

property l0_sensitivity¶: Returns the l0_sensitivity set at initialization.

property linf_sensitivity¶: Returns the linf_sensitivity set at initialization.

memory_used() → float ¶: Returns the memory currently used by the algorithm in bytes.

merge(summary)¶: Merges serialized summary data into this algorithm. The summary proto must represent data from the same algorithm type with identical parameters. The data field must contain the algorithm summary type of the corresponding algorithm used. The summary proto cannot be empty.

noise_confidence_interval(confidence_level: float, privacy_budget: float) → float ¶

Returns the confidence_level confidence interval of noise added within the algorithm with specified privacy budget, using epsilon and other relevant, algorithm-specific parameters (e.g. bounds) provided by the constructor.

This metric may be used to gauge the error rate introduced by the noise.

If the returned value is <x,y>, then the noise added has a confidence_level chance of being in the domain [x,y].

By default, noise_confidence_interval() returns an error. Algorithms for which a confidence interval can feasibly be calculated override this and output the relevant value.

Conservatively, we do not release the error rate for algorithms whose confidence intervals rely on input size.

privacy_budget_left() → float ¶: Returns the remaining privacy budget.

quick_result(data: List[Union[int, float]]) → Union[int, float]¶

Runs the algorithm on the input using the epsilon parameter provided in the constructor and returns output.

Consumes 100% of the privacy budget.

Note: It resets the privacy budget first.

reset() → None ¶: Resets the algorithm to a state in which it has received no input. After reset() is called, the algorithm considers only input added after the most recent reset() call when providing output.

result(privacy_budget: Optional[float] = None, noise_interval_level: Optional[float] = None) → Union[int, float]¶

Gets the algorithm result.

The default call consumes the remaining privacy budget.

When privacy_budget (defined on [0,1]) is set, it consumes only the privacy_budget amount of budget.

noise_interval_level provides the confidence level of the noise confidence interval, which may be included in the algorithm output.

serialize()¶

Serializes summary data of current entries into Summary proto. This allows results from distributed aggregation to be recorded and later merged.

Returns empty summary for algorithms for which serialize is unimplemented.

class pydp.algorithms.laplacian.Median(epsilon: float = 1.0, lower_bound: Optional[Union[int, float]] = None, upper_bound: Optional[Union[int, float]] = None, l0_sensitivity: int = 1, linf_sensitivity: int = 1, dtype: str = 'int')¶

Median computes the median value in the dataset, in a differentially private manner.

add_entries(data: List[Union[int, float]]) → None ¶

Adds multiple inputs to the algorithm.

Note: If the data exceeds the overflow limit of storage, the current list passed is not added.

add_entry(value: Union[int, float]) → None ¶

Adds one input to the algorithm.

Note: If the data exceeds the overflow limit of storage, the current data passed is not added.

property epsilon¶: Returns the epsilon set at initialization.

property l0_sensitivity¶: Returns the l0_sensitivity set at initialization.

property linf_sensitivity¶: Returns the linf_sensitivity set at initialization.

memory_used() → float ¶: Returns the memory currently used by the algorithm in bytes.

merge(summary)¶: Merges serialized summary data into this algorithm. The summary proto must represent data from the same algorithm type with identical parameters. The data field must contain the algorithm summary type of the corresponding algorithm used. The summary proto cannot be empty.

noise_confidence_interval(confidence_level: float, privacy_budget: float) → float ¶

Returns the confidence_level confidence interval of noise added within the algorithm with specified privacy budget, using epsilon and other relevant, algorithm-specific parameters (e.g. bounds) provided by the constructor.

This metric may be used to gauge the error rate introduced by the noise.

If the returned value is <x,y>, then the noise added has a confidence_level chance of being in the domain [x,y].

By default, noise_confidence_interval() returns an error. Algorithms for which a confidence interval can feasibly be calculated override this and output the relevant value.

Conservatively, we do not release the error rate for algorithms whose confidence intervals rely on input size.

privacy_budget_left() → float ¶: Returns the remaining privacy budget.

quick_result(data: List[Union[int, float]]) → Union[int, float]¶

Runs the algorithm on the input using the epsilon parameter provided in the constructor and returns output.

Consumes 100% of the privacy budget.

Note: It resets the privacy budget first.

reset() → None ¶: Resets the algorithm to a state in which it has received no input. After reset() is called, the algorithm considers only input added after the most recent reset() call when providing output.

result(privacy_budget: Optional[float] = None, noise_interval_level: Optional[float] = None) → Union[int, float]¶

Gets the algorithm result.

The default call consumes the remaining privacy budget.

When privacy_budget (defined on [0,1]) is set, it consumes only the privacy_budget amount of budget.

noise_interval_level provides the confidence level of the noise confidence interval, which may be included in the algorithm output.

serialize()¶

Serializes summary data of current entries into Summary proto. This allows results from distributed aggregation to be recorded and later merged.

Returns empty summary for algorithms for which serialize is unimplemented.

class pydp.algorithms.laplacian.Count(epsilon: float = 1.0, l0_sensitivity: int = 1, linf_sensitivity: int = 1, dtype: str = 'int')¶

Count computes the number of items in the dataset, in a differentially private manner.

add_entries(data: List[Union[int, float]]) → None ¶

Adds multiple inputs to the algorithm.

Note: If the data exceeds the overflow limit of storage, the current list passed is not added.

add_entry(value: Union[int, float]) → None ¶

Adds one input to the algorithm.

Note: If the data exceeds the overflow limit of storage, the current data passed is not added.

property epsilon¶: Returns the epsilon set at initialization.

property l0_sensitivity¶: Returns the l0_sensitivity set at initialization.

property linf_sensitivity¶: Returns the linf_sensitivity set at initialization.

memory_used() → float ¶: Returns the memory currently used by the algorithm in bytes.

merge(summary)¶: Merges serialized summary data into this algorithm. The summary proto must represent data from the same algorithm type with identical parameters. The data field must contain the algorithm summary type of the corresponding algorithm used. The summary proto cannot be empty.

noise_confidence_interval(confidence_level: float, privacy_budget: float) → float ¶

Returns the confidence_level confidence interval of noise added within the algorithm with specified privacy budget, using epsilon and other relevant, algorithm-specific parameters (e.g. bounds) provided by the constructor.

This metric may be used to gauge the error rate introduced by the noise.

If the returned value is <x,y>, then the noise added has a confidence_level chance of being in the domain [x,y].

By default, noise_confidence_interval() returns an error. Algorithms for which a confidence interval can feasibly be calculated override this and output the relevant value.

Conservatively, we do not release the error rate for algorithms whose confidence intervals rely on input size.

privacy_budget_left() → float ¶: Returns the remaining privacy budget.

quick_result(data: List[Union[int, float]]) → Union[int, float]¶

Runs the algorithm on the input using the epsilon parameter provided in the constructor and returns output.

Consumes 100% of the privacy budget.

Note: It resets the privacy budget first.

reset() → None ¶: Resets the algorithm to a state in which it has received no input. After reset() is called, the algorithm considers only input added after the most recent reset() call when providing output.

result(privacy_budget: Optional[float] = None, noise_interval_level: Optional[float] = None) → Union[int, float]¶

Gets the algorithm result.

The default call consumes the remaining privacy budget.

When privacy_budget (defined on [0,1]) is set, it consumes only the privacy_budget amount of budget.

noise_interval_level provides the confidence level of the noise confidence interval, which may be included in the algorithm output.

serialize()¶

Serializes summary data of current entries into Summary proto. This allows results from distributed aggregation to be recorded and later merged.

Returns empty summary for algorithms for which serialize is unimplemented.

class pydp.algorithms.laplacian.Percentile(epsilon: float = 1.0, percentile: float = 0.0, lower_bound: Optional[Union[int, float]] = None, upper_bound: Optional[Union[int, float]] = None, dtype: str = 'int')¶

Percentile finds the value in the dataset at the given percentile, in a differentially private manner.

add_entries(data: List[Union[int, float]]) → None ¶

Adds multiple inputs to the algorithm.

Note: If the data exceeds the overflow limit of storage, the current list passed is not added.

add_entry(value: Union[int, float]) → None ¶

Adds one input to the algorithm.

Note: If the data exceeds the overflow limit of storage, the current data passed is not added.

property epsilon¶: Returns the epsilon set at initialization.

property l0_sensitivity¶: Returns the l0_sensitivity set at initialization.

property linf_sensitivity¶: Returns the linf_sensitivity set at initialization.

memory_used() → float ¶: Returns the memory currently used by the algorithm in bytes.

merge(summary)¶: Merges serialized summary data into this algorithm. The summary proto must represent data from the same algorithm type with identical parameters. The data field must contain the algorithm summary type of the corresponding algorithm used. The summary proto cannot be empty.

noise_confidence_interval(confidence_level: float, privacy_budget: float) → float ¶

Returns the confidence_level confidence interval of noise added within the algorithm with specified privacy budget, using epsilon and other relevant, algorithm-specific parameters (e.g. bounds) provided by the constructor.

This metric may be used to gauge the error rate introduced by the noise.

If the returned value is <x,y>, then the noise added has a confidence_level chance of being in the domain [x,y].

By default, noise_confidence_interval() returns an error. Algorithms for which a confidence interval can feasibly be calculated override this and output the relevant value.

Conservatively, we do not release the error rate for algorithms whose confidence intervals rely on input size.

property percentile¶: Returns the percentile set in the constructor.

privacy_budget_left() → float ¶: Returns the remaining privacy budget.

quick_result(data: List[Union[int, float]]) → Union[int, float]¶

Runs the algorithm on the input using the epsilon parameter provided in the constructor and returns output.

Consumes 100% of the privacy budget.

Note: It resets the privacy budget first.

reset() → None ¶: Resets the algorithm to a state in which it has received no input. After reset() is called, the algorithm considers only input added after the most recent reset() call when providing output.

result(privacy_budget: Optional[float] = None, noise_interval_level: Optional[float] = None) → Union[int, float]¶

Gets the algorithm result.

The default call consumes the remaining privacy budget.

When privacy_budget (defined on [0,1]) is set, it consumes only the privacy_budget amount of budget.

noise_interval_level provides the confidence level of the noise confidence interval, which may be included in the algorithm output.

serialize()¶

Serializes summary data of current entries into Summary proto. This allows results from distributed aggregation to be recorded and later merged.

Returns empty summary for algorithms for which serialize is unimplemented.

Distributions

class pydp.distributions.GaussianDistribution¶

sample(self: pydp.GaussianDistribution, scale: float = 1.0) → float ¶

Draws a sample from the Gaussian distribution Gauss(scale*stddev).

Parameters: scale – A factor to scale stddev.

property stddev¶: Returns stddev

class pydp.distributions.LaplaceDistribution¶

Draws samples from the Laplacian distribution.

get_diversity(self: pydp.LaplaceDistribution) → float ¶: Returns the parameter defining this distribution, often labeled b.

get_uniform_double(self: pydp.LaplaceDistribution) → float ¶: Returns a uniform random integer in the range [0, 2^53).

sample(self: pydp.LaplaceDistribution, scale: float = 1.0) → float ¶

Samples the Laplacian distribution Laplace(u, scale*b).

Parameters: scale – A factor to scale b.
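For intuition about what `scale` does, here is a minimal inverse-CDF sampler in plain Python. It is a sketch only and does not reflect how PyDP generates its samples: the function name `sample_laplace`, the `mu` and `rng` parameters, and the use of Python's `random` module are assumptions for illustration.

```python
import math
import random


def sample_laplace(b, scale=1.0, mu=0.0, rng=None):
    """Draw one sample from Laplace(mu, scale * b) by inverting the CDF."""
    if rng is None:
        rng = random.Random()
    p = rng.random()
    while p == 0.0:  # redraw to avoid log(0) at the edge of the interval
        p = rng.random()
    centered = p - 0.5  # lies strictly inside (-0.5, 0.5)
    sign = 1.0 if centered >= 0 else -1.0
    # Laplace quantile function with diversity scale * b.
    return mu - (scale * b) * sign * math.log(1.0 - 2.0 * abs(centered))
```

Because `scale` only multiplies b, doubling it doubles the distance of each sample from `mu` for the same underlying uniform draw.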

Util

pydp.util.Geometric() → int ¶

pydp.util.UniformDouble() → float ¶

pydp.util.correlation(arg0: List[float], arg1: List[float]) → float ¶

pydp.util.get_next_power_of_two(arg0: float) → float ¶

pydp.util.mean(*args, **kwargs)¶

Overloaded function.

mean(arg0: List[float]) -> float
mean(arg0: List[int]) -> float

pydp.util.order_statistics(arg0: float, arg1: List[float]) → float ¶

pydp.util.qnorm(arg0: float, arg1: float, arg2: float) → pydp._pydp.StatusOrD¶

pydp.util.round_to_nearest_multiple(arg0: float, arg1: float) → float ¶

pydp.util.safe_add(arg0: int, arg1: int) → int ¶

pydp.util.safe_square(arg0: int) → int ¶

pydp.util.safe_subtract(arg0: int, arg1: int) → int ¶

pydp.util.standard_deviation(arg0: List[float]) → float ¶

pydp.util.variance(arg0: List[float]) → float ¶

pydp.util.vector_filter(arg0: List[float], arg1: List[bool]) → List[float]¶

pydp.util.vector_to_string(arg0: List[float]) → str ¶

pydp.util.xor_strings(arg0: str, arg1: str) → str ¶