Prepare data easily and efficiently, allowing enterprises to access the data they need at any time
Entity Enhancing Rules (REEs) for data quality: build a self-enhancing system that automatically resolves data quality issues by unifying logical rules and machine learning models
Provide automated, intelligent solutions and comprehensive, in-depth diagnosis to fix data problems
Machine learning or logic deduction?
ML models have strong expressive power, while logic rules are better suited for reasoning
No existing system unifies the two both efficiently and effectively
Performance in big data?
How to handle big data efficiently and achieve parallel scalability
Functionality of data quality?
Existing data quality systems mainly consider conflict resolution and entity resolution
Are there other aspects of data quality functionality that should be supported?
Unifying logical rules and machine learning
Fully automatic parallel rule discovery
Error correction with correctness guarantee
Parallel error detection and incremental detection
Incremental computation in response to updates
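The incremental-detection idea above can be sketched as follows. This is an illustrative toy, not Rock's actual algorithm: when tuples are updated, only the changed tuples are re-checked against a rule, and the result stays consistent with a full recomputation.

```python
def detect_errors(rows, rule):
    """Full pass: indices of all rows that violate the rule."""
    return {i for i, r in enumerate(rows) if not rule(r)}

def incremental_detect(prev_errors, rows, rule, updated_idx):
    """Incremental pass: re-evaluate only the updated rows instead of
    the whole dataset, reusing the previous error set."""
    errors = set(prev_errors) - set(updated_idx)   # drop stale verdicts
    errors |= {i for i in updated_idx if not rule(rows[i])}
    return errors

rows = [{"age": 30}, {"age": -5}, {"age": 40}]
rule = lambda r: r["age"] >= 0
errs = detect_errors(rows, rule)                   # {1}

rows[1]["age"] = 25                                # update fixes the violation
errs = incremental_detect(errs, rows, rule, updated_idx=[1])
print(errs)                                        # set()
```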
Handle data through drag-and-drop operations without handcrafted code, so that users can get started quickly and easily
Provide clear and hierarchical visualization to display data relationships and business logic
Embed machine/deep learning models as predicates in logic rules, to facilitate understanding and enable interpretability
Offer a full range of functionalities, including data access, rule discovery, error detection, and error correction, in an end-to-end data quality solution
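To illustrate how an ML model can be embedded as a predicate inside a logic rule, here is a minimal sketch. The rule form and the token-overlap "model" are illustrative stand-ins, not Rock's implementation: the point is that a rule's precondition can mix a logical predicate (ID equality) with an ML predicate.

```python
def ml_same_person(name_a: str, name_b: str) -> bool:
    """Stand-in for an embedded ML model: a trivial token-overlap check."""
    ta, tb = set(name_a.lower().split()), set(name_b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1) > 0.5

def ree_same_entity(rec_a: dict, rec_b: dict) -> bool:
    """REE-style rule: equal ID (logic predicate) AND the ML predicate
    agrees on the names => the records describe the same entity."""
    return rec_a["id"] == rec_b["id"] and ml_same_person(rec_a["name"], rec_b["name"])

a = {"id": "X123", "name": "Mary J. Smith"}
b = {"id": "X123", "name": "Mary Smith"}
print(ree_same_entity(a, b))  # True
```

Because the ML call is just one predicate among others, the rest of the rule stays interpretable: a user can read exactly which conditions fired.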
Provide clear visualization for quick and easy data analysis
Automatic execution of data cleaning tasks without handcrafted code
Intelligent data profiling and error correction with a rigorous scoring mechanism, ensuring that data is processed in a logical and systematic manner
For data sources that require quality analysis and processing, we enable unified data access and centralized management, and support various queries via data labels.
By dragging and dropping, users can efficiently create visible and interactive reports, allowing them to have an intuitive understanding of data distribution and relationships.
Support the access and import of data from traditional databases as well as unstructured and semi-structured sources, and offer a full range of functionalities to access various data sources and integrate multi-source data.
Rock provides comprehensive and visible data profiling capabilities, allowing users to fully understand their data from different perspectives and dimensions.
By combining existing data quality rules and data standards, we conduct a comprehensive examination of user data, summarizing information such as null values, data types, and value distributions.
We provide profiling reports with clear visualization, allowing users to gain a complete understanding of their data. Users can view details of data conflicts and efficiently explore the status of the data in their tables.
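A toy version of this kind of profiling pass, summarizing null counts, inferred types, and value distributions per column (illustrative only; Rock's profiling is far more comprehensive):

```python
from collections import Counter

def profile(rows, columns):
    """Per-column summary: null count, observed value types, top values."""
    report = {}
    for col in columns:
        values = [r.get(col) for r in rows]
        non_null = [v for v in values if v is not None]
        report[col] = {
            "nulls": len(values) - len(non_null),
            "types": sorted({type(v).__name__ for v in non_null}),
            "top_values": Counter(non_null).most_common(3),
        }
    return report

rows = [
    {"city": "Paris", "age": 31},
    {"city": "Paris", "age": None},
    {"city": "Lyon", "age": 28},
]
report = profile(rows, ["city", "age"])
print(report["age"]["nulls"])           # 1
print(report["city"]["top_values"][0])  # ('Paris', 2)
```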
Support automatic rule discovery for users to quickly understand their data and the underlying patterns, and provide detailed evaluation reports for business users.
Discover potential data quality rules automatically without handcrafted code, and visualize the rules for better understanding.
By combining logic rules with AI, we offer both interpretability and in-depth analysis, and outperform existing inference engines in terms of rule types supported.
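A naive, single-attribute version of rule discovery can be sketched as follows. Real discovery is parallel and covers much richer rule types; this toy only checks which column pairs satisfy a functional dependency X → Y on the given data.

```python
def discover_fds(rows, columns):
    """Enumerate single-attribute FDs X -> Y that hold on the data:
    every X-value must map to exactly one Y-value."""
    fds = []
    for x in columns:
        for y in columns:
            if x == y:
                continue
            mapping, holds = {}, True
            for r in rows:
                if r[x] in mapping and mapping[r[x]] != r[y]:
                    holds = False  # same X-value, two different Y-values
                    break
                mapping[r[x]] = r[y]
            if holds:
                fds.append((x, y))
    return fds

rows = [
    {"zip": "75001", "city": "Paris"},
    {"zip": "75001", "city": "Paris"},
    {"zip": "69001", "city": "Lyon"},
]
fds = discover_fds(rows, ["zip", "city"])
print(fds)  # [('zip', 'city'), ('city', 'zip')]
```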
Rock supports efficient parallel algorithms for error detection and error correction.
Generate heat maps based on the conflict levels and help users identify critical data.
By iteratively executing rules, conflicts and errors can be identified, improving data consistency, integrity, and accuracy.
Provide parallel algorithms for error detection and error correction, generate real-time operation logs, and support undo actions with high fault tolerance.
By leveraging the interaction of entity resolution (ER), conflict resolution (CR), missing value imputation (MI), and timeliness deduction (TD), data can be automatically repaired, improving overall data quality.
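The iterative rule execution described above can be sketched as a fixpoint loop: rules are applied repeatedly until no rule changes any record. The repair rule shown (imputing country from a known city) is a toy MI-style example, not one of Rock's rules.

```python
def repair_to_fixpoint(records, rules, max_rounds=10):
    """Apply repair rules until no rule changes any record (fixpoint).
    Each rule takes a record and returns True iff it changed it."""
    for _ in range(max_rounds):
        changed = False
        for rec in records:
            for rule in rules:
                changed |= rule(rec)
        if not changed:
            break
    return records

def fill_country_from_city(rec):
    """Toy imputation rule: fill a missing country from a known city."""
    known = {"Paris": "France", "Lyon": "France"}
    if rec.get("country") is None and rec.get("city") in known:
        rec["country"] = known[rec["city"]]
        return True
    return False

recs = [{"city": "Paris", "country": None}]
repair_to_fixpoint(recs, [fill_country_from_city])
print(recs[0]["country"])  # France
```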
By leveraging external knowledge, Rock infers the characteristics of the data and conducts automatic standardization based on the inferred characteristics.
In the financial domain, customer identity attributes such as name, gender, nationality, occupation, and address can be standardized. Likewise, in the e-commerce domain, product description attributes such as brand, category, color, and title can be standardized.
Rock incorporates industry-standard specifications and automatically transforms non-standard terms into standard ones via semantic parsing.
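A minimal sketch of dictionary-based standardization, assuming a small hand-made vocabulary in place of a real industry-standard specification (Rock's semantic parsing is considerably more general):

```python
# Illustrative mapping from non-standard to standard terms.
STANDARD = {"f": "female", "fem": "female", "m": "male"}

def standardize(value: str) -> str:
    """Normalize a value against the standard vocabulary;
    unknown values pass through unchanged."""
    key = value.strip().lower()
    return STANDARD.get(key, value)

print(standardize(" F "))  # female
```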
Rock provides an end-to-end solution for ER, efficiently identifying records that refer to the same entity when data is integrated from different sources.
Provide an efficient integration algorithm for data from different sources and identify the best records.
Provide an efficient parallel entity blocking method to quickly and accurately identify redundant entities in massive multi-source data.
Identify redundant records from multi-source data in a zero-shot or few-shot learning manner and incorporate transfer learning techniques to achieve effective entity matching across different domains.
Incorporate large language models into entity resolution to further enhance the performance by utilizing techniques such as prompt engineering and knowledge distillation.
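Blocking followed by within-block matching can be sketched as below. The zip-code blocking key and the surname matcher are illustrative stand-ins for Rock's parallel blocking and its ML/LLM-based matchers; the point is that expensive pairwise matching runs only inside each block, not over the full cross product.

```python
from collections import defaultdict

def block_by_key(records, key_fn):
    """Group records into blocks sharing a cheap blocking key."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[key_fn(rec)].append(rec)
    return blocks

def match_within_blocks(blocks, same_entity):
    """Run a matcher (rule-, ML-, or LLM-based) on intra-block pairs only."""
    pairs = []
    for recs in blocks.values():
        for i in range(len(recs)):
            for j in range(i + 1, len(recs)):
                if same_entity(recs[i], recs[j]):
                    pairs.append((recs[i], recs[j]))
    return pairs

records = [
    {"name": "Mary Smith", "zip": "75001"},
    {"name": "M. Smith", "zip": "75001"},
    {"name": "John Doe", "zip": "10001"},
]
blocks = block_by_key(records, lambda r: r["zip"])
# Toy matcher: shared surname; a real system would call an ML model or LLM here.
matches = match_within_blocks(
    blocks, lambda a, b: a["name"].split()[-1] == b["name"].split()[-1]
)
print(len(matches))  # 1
```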