DataCleaner is a data quality analysis tool that allows you to perform data profiling, validation, and minor ETL-like tasks. These activities help you administer and monitor your data quality to ensure that your data is useful and applicable to your business situation. It can be used for master data management (MDM) methodologies, data warehousing projects, statistical research, preparation for extract-transform-load activities, and more.
Tags: Office/Business, Scientific/Engineering, Information Management, Metadata/Semantic Models, Records Management, Database, Data Warehousing, Business Intelligence, Data Profiling
Who will post the best content for use in DataCleaner?
Human Inference is announcing a competition for the DataCleaner community. The goal is to...
Release Notes: The 'Synonym lookup' transformation now has an option to look up every token of the input. This is useful if you're replacing synonyms within the values of a long text field. A potential failure when blocking execution of DataCleaner jobs through the monitor's Web service was fixed. The way jobs and the sequence of components are closed and cleaned up after execution was improved. The Java WebStart version of DataCleaner was affected by a bug in the Java runtime that caused certain JAR files not to be recognized by the WebStart launcher under certain circumstances.
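To illustrate the difference between a whole-value lookup and the token-wise lookup mentioned in the notes above, here is a minimal sketch in Python. It is not DataCleaner's implementation: the `SYNONYMS` catalog, the function names, and the rule that a token is any run of word characters are all assumptions made for the example.

```python
import re

# Hypothetical synonym catalog: maps each synonym to its master term.
SYNONYMS = {"st": "street", "str": "street", "ave": "avenue"}


def lookup_value(value, synonyms):
    """Whole-value lookup: replace only if the entire value is a known synonym."""
    return synonyms.get(value.lower(), value)


def lookup_tokens(value, synonyms):
    """Token-wise lookup: replace every word token that is a known synonym."""
    def replace(match):
        token = match.group(0)
        # Unknown tokens are returned unchanged.
        return synonyms.get(token.lower(), token)

    # Assumption: a token is a run of word characters; punctuation is kept as-is.
    return re.sub(r"\w+", replace, value)
```

With this sketch, `lookup_value("12 Main St", SYNONYMS)` leaves the value untouched because the full string is not in the catalog, while `lookup_tokens` replaces the `St` token inside the longer text.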
Release Notes: It is now possible to hide output columns of transformations. Hiding does not affect the processing flow; it only removes the columns from the user interface, which can make for a cleaner experience when interacting with other components. A new Web service has been added to the monitoring Web application which provides a way to poll the status of the execution of a particular job. A bug has been fixed which caused the HTML report to fail for certain analysis types when no records had been processed. Six other minor bugs have been addressed.
Release Notes: This release adds a new filter for performing Change Data Capture, queues the execution of jobs to avoid concurrent execution issues, and includes several minor bugfixes and improvements.
Release Notes: A major milestone for the data quality monitoring Web application: the addition of connectivity to Salesforce and SugarCRM. Addition of wizards and other user experience improvements. Enables clustered execution of jobs. New data visualization extension and a national identifier validation extension. Adds Pentaho Data Integration job scheduling and execution.
Release Notes: A Web service was added to the monitoring application for retrieving one or more metric values. The 'Table lookup' component has been improved by adding join semantics as a configurable property. The EasyDQ components have been upgraded, adding further configuration options and a richer deduplication result interface. Performance improvements have been a specific focus of this release. Improvements have been made in the engine of DataCleaner to further utilize a streaming processing approach in certain corner cases that were not covered previously.
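As a rough illustration of what configurable join semantics can mean for a lookup component, the Python sketch below contrasts left-join and inner-join behavior. It is only an assumption about the general idea, not DataCleaner's API: the `table_lookup` function, its `semantics` parameter, and the `looked_up` output column are hypothetical names.

```python
def table_lookup(rows, table, key, semantics="left"):
    """Enrich each row with a value looked up in `table` by row[key].

    semantics="left":  keep every row; unmatched rows get None.
    semantics="inner": drop rows that have no match in the table.
    """
    if semantics not in ("left", "inner"):
        raise ValueError("semantics must be 'left' or 'inner'")

    result = []
    for row in rows:
        match = table.get(row[key])
        if match is None and semantics == "inner":
            continue  # inner join: unmatched rows are filtered out
        # Copy the row and add the hypothetical output column.
        result.append({**row, "looked_up": match})
    return result
```

With left semantics the lookup acts as a pure enrichment step, while inner semantics also turns it into a filter on rows without a match.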