Welcome to OutlierDenStream’s documentation!¶
OutlierDenStream¶
DEPRECATED!¶
Please use https://github.com/anrputina/oadds instead.
Installation¶
Stable release¶
A release on the Python Package Index (PyPI) is coming soon. Until then, please install from the sources.
From sources¶
The sources for OutlierDenStream can be downloaded from the GitHub repo.
You can either clone the public repository:
$ git clone https://github.com/anrputina/outlierdenstream
Or download the tarball:
$ curl -OL https://github.com/anrputina/outlierdenstream/tarball/master
Once you have a copy of the source, you can install it with:
$ python setup.py install
Usage¶
To use OutlierDenStream in a project:
from outlierdenstream import Sample, OutlierDenStream
Initialize an OutlierDenStream object, where bufferDf holds the initial buffer of samples:
ods = OutlierDenStream(lamb=0.03, epsilon='auto', beta=0.03, mu='auto', startingBuffer=bufferDf, tp=12)
ods.runInitialization()
Fit each sample of the dataset with:
# here the position of the row in the stream serves as its integer timestamp
for timestamp, row in enumerate(dataset):
    sample = Sample(row, timestamp)
    result = ods.runOnNewSample(sample)
The algorithm returns True (outlier) if it cannot merge the new sample into an existing core-micro-cluster or if it merges the sample into an existing outlier-micro-cluster. It returns False (normal) otherwise.
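The loop above can be wrapped in a small helper that records which timestamps were flagged. The sketch below is only an illustration: Sample and ThresholdDetector are hypothetical stand-ins defined here so that the example runs without the library, and ThresholdDetector is not the real algorithm. It only mimics the documented contract that runOnNewSample returns True for an outlier.

```python
class Sample:
    """Hypothetical stand-in mirroring the documented Sample(value, timestamp)."""

    def __init__(self, value, timestamp):
        self.value = value
        self.timestamp = timestamp


class ThresholdDetector:
    """Hypothetical stand-in for OutlierDenStream, NOT the real algorithm.

    It flags a sample when the absolute value of its first feature exceeds
    a fixed limit, returning True (outlier) or False (normal).
    """

    def __init__(self, limit):
        self.limit = limit

    def runOnNewSample(self, sample):
        return abs(sample.value[0]) > self.limit


def collect_outliers(detector, rows):
    """Feed rows one by one and return the timestamps flagged as outliers."""
    flagged = []
    for timestamp, row in enumerate(rows):
        if detector.runOnNewSample(Sample(row, timestamp)):
            flagged.append(timestamp)
    return flagged


rows = [[0.1], [0.2], [9.0], [0.3]]
outlier_timestamps = collect_outliers(ThresholdDetector(limit=1.0), rows)
print(outlier_timestamps)
```

Any object exposing runOnNewSample(sample) could be passed as the detector, so the same helper should work with an initialized OutlierDenStream instance, assuming its Sample class is used instead of the stand-in.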
Documentation¶
Sample¶
class outlierdenstream.Sample(value, timestamp: int)
Each record of the stream has to be declared as a Sample object.
Parameters:
- value – the values of the current sample.
- timestamp – the timestamp of the current sample.
Micro-Cluster¶
class outlierdenstream.MicroCluster(currenttimestamp, lamb, clusterNumber)
Micro-cluster class.
Parameters:
- currenttimestamp – the timestamp at which the cluster is created.
- lamb – the lambda parameter, used as decay factor.
- clusterNumber – the number of the micro-cluster.
insertSample(sample, timestamp=0)
Adds a sample to the micro-cluster and updates its variables with updateRealTimeWeight() and updateRealTimeLSandSS().
Parameters:
- sample – the sample object.
- timestamp – deprecated and no longer needed; it will be removed in a future version.
noNewSamples()
Updates the Weighted Linear Sum (WLS), the Weighted Squared Sum (WSS) and the weight of the micro-cluster when no new samples are merged.
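As a rough illustration of how faded statistics behave, the sketch below keeps a weight, a WLS and a WSS per dimension and decays them over time. It is not the library's code: the class and method names are hypothetical, and the fading function 2^(-lamb * dt) is assumed from the original DenStream formulation rather than taken from this package.

```python
class DecayedMicroCluster:
    """Toy micro-cluster with exponentially faded statistics (illustrative only)."""

    def __init__(self, dimensions, lamb, timestamp):
        self.lamb = lamb
        self.last_update = timestamp
        self.weight = 0.0
        self.wls = [0.0] * dimensions  # Weighted Linear Sum
        self.wss = [0.0] * dimensions  # Weighted Squared Sum

    def _fade(self, timestamp):
        # Assumed fading function: 2^(-lamb * elapsed time).
        factor = 2.0 ** (-self.lamb * (timestamp - self.last_update))
        self.weight *= factor
        self.wls = [value * factor for value in self.wls]
        self.wss = [value * factor for value in self.wss]
        self.last_update = timestamp

    def insert(self, values, timestamp):
        """Fade the old statistics, then add the new sample with weight 1."""
        self._fade(timestamp)
        self.weight += 1.0
        self.wls = [s + v for s, v in zip(self.wls, values)]
        self.wss = [s + v * v for s, v in zip(self.wss, values)]

    def no_new_samples(self, timestamp):
        """Only fade: nothing was merged at this timestamp."""
        self._fade(timestamp)

    def center(self):
        return [s / self.weight for s in self.wls]


mc = DecayedMicroCluster(dimensions=1, lamb=1.0, timestamp=0)
mc.insert([4.0], timestamp=0)
mc.no_new_samples(timestamp=1)  # with lamb=1.0 the weight halves: 1.0 -> 0.5
mc.insert([2.0], timestamp=1)   # the new sample adds weight 1: 0.5 -> 1.5
print(mc.weight, mc.center())
```

With a larger lamb, old samples lose influence faster, so the center tracks recent data more closely.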
Cluster¶
OutlierDenStream¶
class outlierdenstream.OutlierDenStream(lamb, epsilon=1, minPts=1, beta=1, mu=1, numberInitialSamples=None, startingBuffer=None, tp=60, radiusFactor=1)
OutlierDenStream class.
Parameters:
- lamb – the lambda parameter (fading factor).
- epsilon – the epsilon parameter.
- beta – the beta parameter.
- mu – the mu parameter.
- numberInitialSamples – the number of samples to use as initial buffer.
- startingBuffer – the initial buffer on which to apply DBScan, or to use as a single class.
- tp – the frequency at which to apply the pruning strategy and remove old micro-clusters.
initWithoutDBScan()
Produces a single micro-cluster by merging all the samples passed in the initial buffer.
If epsilon is 'auto', computes epsilon as the maximum radius obtained from these initial samples.
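One possible reading of the automatic epsilon is sketched below: samples are merged one at a time into a single cluster, and the largest radius observed along the way is kept. This is an assumption about the behaviour, not the library's implementation. The function name auto_epsilon is hypothetical, fading is ignored for simplicity, and the radius is taken as the square root of the summed per-dimension variance.

```python
import math


def auto_epsilon(samples):
    """Return the largest radius seen while merging samples one by one."""
    dimensions = len(samples[0])
    linear_sum = [0.0] * dimensions
    squared_sum = [0.0] * dimensions
    max_radius = 0.0
    for count, values in enumerate(samples, start=1):
        linear_sum = [s + v for s, v in zip(linear_sum, values)]
        squared_sum = [s + v * v for s, v in zip(squared_sum, values)]
        # Sum of per-dimension variances, each clamped at 0 to absorb rounding errors.
        variance = sum(
            max(0.0, ss / count - (ls / count) ** 2)
            for ls, ss in zip(linear_sum, squared_sum)
        )
        max_radius = max(max_radius, math.sqrt(variance))
    return max_radius


buffer = [[0.0], [2.0], [1.0]]
epsilon = auto_epsilon(buffer)
print(epsilon)
```

In this example the radius peaks after the second sample and shrinks once the third sample lands on the center, which is why the maximum is tracked rather than the final radius.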
resetLearningImpl()
Initializes two empty Cluster objects as the p-micro-cluster list and the o-micro-cluster list.
If mu is 'auto', computes its value.
runDBSCanInitialization()
Initializes the variables of the main algorithm with the methods resetLearningImpl() and initDBScan().
runInitialization()
Initializes the variables of the main algorithm with the methods resetLearningImpl() and initWithoutDBScan().
runOnNewSample(sample)
Performs the basic DenStream procedure for merging new samples:
- Try to merge the sample into the closest core-micro-cluster, or
- Try to merge the sample into the closest outlier-micro-cluster, or
- Generate a new outlier-micro-cluster from the sample.
Parameters:
- sample – the new sample available in the stream.
Returns: False if the sample is merged into an existing core-micro-cluster, otherwise True, meaning an “anomalous” sample.
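The three-step decision can be sketched in a few lines. The version below is deliberately simplified and is not the library's code: clusters are reduced to fixed centers, a sample "merges" when it lies within epsilon of the closest center, and decay, center updates, promotion of outlier clusters and pruning are all left out. The function names are hypothetical.

```python
import math


def nearest_distance(centers, point):
    """Return the distance from point to its closest center (inf if none)."""
    best = math.inf
    for center in centers:
        distance = math.sqrt(sum((c - p) ** 2 for c, p in zip(center, point)))
        best = min(best, distance)
    return best


def run_on_new_sample(point, core_centers, outlier_centers, epsilon):
    """Simplified three-step merge; returns True when the sample is anomalous."""
    # 1. Try the closest core-micro-cluster.
    if nearest_distance(core_centers, point) <= epsilon:
        return False
    # 2. Try the closest outlier-micro-cluster.
    if nearest_distance(outlier_centers, point) <= epsilon:
        return True
    # 3. Otherwise start a new outlier-micro-cluster from this sample.
    outlier_centers.append(list(point))
    return True


core = [[0.0, 0.0]]
outliers = []
results = [
    run_on_new_sample([0.1, 0.0], core, outliers, epsilon=0.5),  # near the core center
    run_on_new_sample([5.0, 5.0], core, outliers, epsilon=0.5),  # far away: new outlier cluster
    run_on_new_sample([5.1, 5.0], core, outliers, epsilon=0.5),  # joins that outlier cluster
]
print(results, len(outliers))
```

Only the first step yields False. Both the second and third steps report the sample as anomalous, which matches the return value described above.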
Contributing¶
Contributions are welcome, and they are greatly appreciated! Every little bit helps, and credit will always be given.
You can contribute in many ways:
Types of Contributions¶
Report Bugs¶
Report bugs at https://github.com/anrputina/outlierdenstream/issues.
If you are reporting a bug, please include:
- Your operating system name and version.
- Any details about your local setup that might be helpful in troubleshooting.
- Detailed steps to reproduce the bug.
Fix Bugs¶
Look through the GitHub issues for bugs. Anything tagged with “bug” and “help wanted” is open to whoever wants to implement it.
Implement Features¶
Look through the GitHub issues for features. Anything tagged with “enhancement” and “help wanted” is open to whoever wants to implement it.
Write Documentation¶
OutlierDenStream could always use more documentation, whether as part of the official OutlierDenStream docs, in docstrings, or even on the web in blog posts, articles, and such.
Submit Feedback¶
The best way to send feedback is to file an issue at https://github.com/anrputina/outlierdenstream/issues.
If you are proposing a feature:
- Explain in detail how it would work.
- Keep the scope as narrow as possible, to make it easier to implement.
- Remember that this is a volunteer-driven project, and that contributions are welcome :)
Get Started!¶
Ready to contribute? Here’s how to set up outlierdenstream for local development.
Fork the outlierdenstream repo on GitHub.
Clone your fork locally:
$ git clone git@github.com:your_name_here/outlierdenstream.git
Install your local copy into a virtualenv. Assuming you have virtualenvwrapper installed, this is how you set up your fork for local development:
$ mkvirtualenv outlierdenstream
$ cd outlierdenstream/
$ python setup.py develop
Create a branch for local development:
$ git checkout -b name-of-your-bugfix-or-feature
Now you can make your changes locally.
When you’re done making changes, check that your changes pass flake8 and the tests, including testing other Python versions with tox:
$ flake8 outlierdenstream tests
$ python setup.py test or py.test
$ tox
To get flake8 and tox, just pip install them into your virtualenv.
Commit your changes and push your branch to GitHub:
$ git add .
$ git commit -m "Your detailed description of your changes."
$ git push origin name-of-your-bugfix-or-feature
Submit a pull request through the GitHub website.
Pull Request Guidelines¶
Before you submit a pull request, check that it meets these guidelines:
- The pull request should include tests.
- If the pull request adds functionality, the docs should be updated. Put your new functionality into a function with a docstring, and add the feature to the list in README.rst.
- The pull request should work for Python 2.7, 3.4, 3.5 and 3.6, and for PyPy. Check https://travis-ci.org/anrputina/outlierdenstream/pull_requests and make sure that the tests pass for all supported Python versions.
Deploying¶
A reminder for the maintainers on how to deploy. Make sure all your changes are committed (including an entry in HISTORY.rst). Then run:
$ bumpversion patch # possible: major / minor / patch
$ git push
$ git push --tags
Travis will then deploy to PyPI if tests pass.
Credits¶
Development Lead¶
- Andrian Putina <anr.putina@gmail.com>
Contributors¶
None yet. Why not be the first?