Modules
Find the module that fits your needs!
Extract
Load data from any source into your preferred format.
Transform
bias
(WIP) Reduce skewed or prejudiced data, particularly data that reinforce stereotypes.
cleaning
Remove irrelevant, redundant, or noisy information, such as stop words or special characters.
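As a rough illustration of what a cleaning step might do, here is a minimal standalone sketch in plain Python. It is not Dataverse's own implementation: the `clean_text` function and the tiny `STOP_WORDS` set are hypothetical and exist only for this example.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "of", "and"}


def clean_text(text: str) -> str:
    """Strip special characters and stop words from a string."""
    # Replace anything that is not an ASCII letter, digit, or whitespace with a space.
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)
    # Drop stop words (case-insensitive); joining on a single space collapses whitespace.
    tokens = [tok for tok in text.split() if tok.lower() not in STOP_WORDS]
    return " ".join(tokens)


print(clean_text("The price of the item is $5!!"))
```

A real cleaning step would likely use a language-specific stop-word list and handle Unicode text rather than ASCII only.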
decontamination
(WIP) Remove contaminated data, including benchmark data.
deduplication
Remove duplicated data, targeting both exact matches and near-duplicates.
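To show how exact and similar matches can both be caught, the sketch below compares documents by the Jaccard similarity of their word trigrams. This is one common technique and an assumption on our part, not necessarily the method Dataverse uses. The names `shingles`, `jaccard`, `deduplicate`, and the `threshold` parameter are all hypothetical.

```python
from typing import List, Set, Tuple


def shingles(text: str, n: int = 3) -> Set[Tuple[str, ...]]:
    """Return the set of lowercase word n-grams in the text."""
    words = text.lower().split()
    if len(words) < n:
        # Short texts become a single shingle so they can still be compared.
        return {tuple(words)}
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def deduplicate(docs: List[str], threshold: float = 0.8) -> List[str]:
    """Keep the first document of every group whose similarity reaches the threshold."""
    kept: List[str] = []
    kept_shingles: List[set] = []
    for doc in docs:
        current = shingles(doc)
        # Identical documents score 1.0, so exact duplicates are covered as well.
        if any(jaccard(current, seen) >= threshold for seen in kept_shingles):
            continue
        kept.append(doc)
        kept_shingles.append(current)
    return kept
```

This pairwise comparison is quadratic in the number of documents. At scale, approximate methods such as MinHash are typically used instead.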
pii
Remove sensitive information from data. PII stands for Personally Identifiable Information.
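As an illustration of the idea, the following sketch masks email addresses and simple phone numbers with regular expressions. It is a simplified example under our own assumptions, not the Dataverse implementation: the patterns, the placeholder tokens, and the `mask_pii` function are hypothetical, and they cover only a narrow range of formats.

```python
import re

# Simplified patterns for illustration; real PII detection needs far broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3,4}[-.\s]\d{4}\b")


def mask_pii(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


print(mask_pii("Contact jane.doe@example.com or 555-123-4567."))
```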
quality
Improve data quality in terms of accuracy, consistency, and reliability.
toxicity
(WIP) Remove harmful, offensive, or inappropriate content from the data.
Load
Save the processed data to a preferred destination, such as a data lake or a database.
Utils
Essential tools for data processing, including sampling, logging, statistics, etc.
Pipeline
Execute the given ETL configuration.
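Conceptually, executing an ETL configuration means reading an ordered list of steps and applying each one to the output of the previous step. The sketch below shows that idea in plain Python. The config layout, the `STEPS` table, and `run_pipeline` are hypothetical and do not reflect Dataverse's actual configuration schema or pipeline API.

```python
from typing import Any, Callable, Dict, List

# Hypothetical step implementations, keyed by the name used in the config.
STEPS: Dict[str, Callable[..., List[str]]] = {
    "strip": lambda rows: [r.strip() for r in rows],
    "drop_empty": lambda rows: [r for r in rows if r],
    "truncate": lambda rows, max_len: [r[:max_len] for r in rows],
}


def run_pipeline(config: Dict[str, Any], data: List[str]) -> List[str]:
    """Apply each configured step in order, feeding its output to the next step."""
    for step in config["steps"]:
        func = STEPS[step["name"]]
        data = func(data, **step.get("args", {}))
    return data


# Illustrative config: each entry names a step and, optionally, its arguments.
config = {
    "steps": [
        {"name": "strip"},
        {"name": "drop_empty"},
        {"name": "truncate", "args": {"max_len": 5}},
    ]
}

print(run_pipeline(config, ["  hello world ", "   ", "hi"]))
```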
Register
Register custom functions to be used in Dataverse.