Glossary entry

German term or phrase:

Datenbereinigung

English translation:

data cleaning

Added to glossary by Rebecca Holmes
Oct 8, 2002 06:48
German to English Tech/Engineering computer database system
From a PPT slide listing the advantages of a customer database system:

Datenvalidierung und -bereinigung schon beim Import
Proposed translations (English)
4 +3 data cleaning
4 +2 data clean-up
5 data cleansing
5 Data validation and clean-up on import
3 +1 data filtering / cleansing
1 Oooops ...

Proposed translations

+3
9 mins
Selected

data cleaning

That's the standard term in English as well.
Peer comment(s):

neutral Klaus Dorn (X) : I remember it being called "cleansing"
5 mins
OK as well, of course; "standard term" referred to cleaning/cleansing as opposed to filtering.
agree Gillian Scheibelein : data cleaning = data cleansing, matter of preference
6 mins
Clearly.
agree Chris Rowson (X) : In my experience (20 years IT), "data cleansing" is seldom heard and sounds odd, if not uninformed. "Data cleansing", however, is natural for me.
30 mins
agree Steffen Walter
4 hrs
4 KudoZ points awarded for this answer. Comment: "Thank you very much Klaus D., Endre, Chris, Joanne, Klaus B. and Martin. I have waited a couple days to pick an answer because the well-informed choices you provided made it very difficult to select just one. In the end the only fair way to call it seems to be number of Google hits: 13,800 for data cleaning, 11,500 for data cleansing, 10,900 for data filtering and 2,450 for data clean-up. It thus seems only fair to pick Endre's answer of "data cleaning." I really appreciate the amount of research you put into the question, however, Klaus D., and would like to extend my special thanks to all of you for your time and effort."
+1
3 mins

data filtering / cleansing

I favour "filtering" here, because it happens at import, while "cleansing" is something that is traditionally done afterwards...
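
To make the distinction concrete, here is a minimal sketch of validation (filtering) and cleaning happening together at import. Everything in it is an assumption for illustration: the function name `import_customers`, the `name` and `email` columns, and the very loose e-mail check are hypothetical and do not come from the slide or any particular product.

```python
import csv
import io


def import_customers(csv_text):
    """Validate and clean rows while importing; return (accepted, rejected)."""
    accepted, rejected = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Cleaning: repair values that are recoverable
        # (collapse stray whitespace, lower-case the e-mail address).
        name = " ".join((row.get("name") or "").split())
        email = (row.get("email") or "").strip().lower()

        # Validation / filtering: rows that cannot be repaired are kept out
        # of the database. The e-mail rule is deliberately crude (assumption).
        if not name or email.count("@") != 1:
            rejected.append(row)
            continue

        accepted.append({"name": name, "email": email})
    return accepted, rejected
```

Rows that fail the check never reach the database (filtering), while rows that pass are stored in normalised form (cleaning or cleansing).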

--------------------------------------------------
Note added at 2002-10-08 06:53:39 (GMT)
--------------------------------------------------

"data validation and filtering already at import"

--------------------------------------------------
Note added at 2002-10-08 07:04:20 (GMT)
--------------------------------------------------

This project focuses on data cleansing, i.e., to detect and remove errors and inconsistencies in data from different sources to improve the data quality.

http://www.ics.uci.edu/~chenli/cleansing.html

--------------------------------------------------
Note added at 2002-10-08 07:05:01 (GMT)
--------------------------------------------------

Change the way you maintain your customer database. Experian Intact is the UK's leading Internet based data cleansing application.

http://www.experianintact.com/

--------------------------------------------------
Note added at 2002-10-08 07:05:51 (GMT)
--------------------------------------------------

Although commercial data cleansing and standardization software tools have been around for years, until fairly recently they weren't suitable for Web applications.

http://www.eweek.com/article2/0,,220591,00.asp?kc=EWAV10209K...

--------------------------------------------------
Note added at 2002-10-08 07:08:52 (GMT)
--------------------------------------------------

Data cleansing takes precedence
Friday 9th August 2002

http://www.it-director.com/article.php?id=3090

--------------------------------------------------
Note added at 2002-10-08 07:09:53 (GMT)
--------------------------------------------------

Data Cleansing Research Project*

--------------------------------------------------------------------------------

Summary:
This research is aimed at defining a framework for automated data cleansing. That is, given a large data set, automatically find and correct errors (semantic and syntactic) within the set. The underlying theoretical aspects of data quality research are being combined with problem solving methods from software testing, data mining, statistics, knowledge based systems, clustering, and machine learning to address this framework. The framework will define an underlying theory to support an accurate set of data quality metrics. A basic understanding of the inherent problems faced by automated data cleansing are being uncovered and investigated.


Technical Reports:

TR-CS-99-02 Progress Report on Data Cleansing 10-18-1999
TR-CS-00-02 Automated Identification of Errors in Data Sets 2-2-2000
TR-CS-00-03 Utilizing Association Rules for Identification of Possible Errors in Data Sets 2-28-2000
TR-CS-00-04 Utilizing Association Rules for the Data Cleansing 5-8-2000


http://www.msci.memphis.edu/~maleticj/dataclean.html

--------------------------------------------------
Note added at 2002-10-08 07:13:03 (GMT)
--------------------------------------------------

Abstract: The paper analyzes the problem of data cleansing and automatically identifying potential errors in data sets. An overview of the diminutive amount of existing literature concerning data cleansing is given. Methods for error detection that go beyond integrity analysis are reviewed and presented. The applicable methods include: statistical outlier detection, pattern matching, clustering, and data mining techniques. Some brief results supporting the use of such methods are given.

http://citeseer.nj.nec.com/maletic00data.html
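
As an illustration of the first method the abstract lists, statistical outlier detection, here is a minimal sketch. It is not the paper's method: the modified z-score based on the median absolute deviation, the function name `flag_outliers`, and the threshold of 3.5 are all assumptions chosen for the example.

```python
import statistics


def flag_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold."""
    median = statistics.median(values)
    # Median absolute deviation: a spread measure that is itself
    # barely affected by the outliers we are trying to find.
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        # More than half the values equal the median; no usable spread.
        return []
    return [
        i
        for i, v in enumerate(values)
        if abs(0.6745 * (v - median) / mad) > threshold
    ]
```

A flagged value is only a candidate error; whether it is corrected, removed, or kept is a separate decision.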
Peer comment(s):

agree Gillian Scheibelein
13 mins
41 mins

Oooops ...

... that was an unfortunate typo in my agree to EB: I meant to say "data cleaning" is normal.

--------------------------------------------------
Note added at 2002-10-08 07:36:11 (GMT)
--------------------------------------------------

P.S. I have specified and implemented data cleaning modules (in an American bank).
3 hrs

data cleansing

This is also the standard term used by SAP.
+2
7 hrs

data clean-up

Not cleansing. It's not unlike 'cleaning up your files'. You don't cleanse them. 'Filtering' is not a general "Bereinigung", but more of a 'sorting' action.
Peer comment(s):

agree foehnerk (X) : data clean-up is the usual term when transferring data (e.g. customer records) to a new system, e.g. from AS400 to SAP; you delete the records that are no longer valid, or are duplicated with different abbreviations, etc.
35 mins
agree Johanna Timm, PhD
4 hrs
13 hrs

Data validation and clean-up on import

That's what I frequently use in DB-related translations.