
Integrating AI and Data Science

The key to Data Science success is the CRISP methodology

11 February 2021. Updated on 15 May 2023.

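The CRISP methodology (originally known as CRISP-DM, the Cross-Industry Standard Process for Data Mining) was first developed in the late 1990s for data mining projects by an industry consortium that included SPSS, a company since acquired by IBM. It remains one of the most widely used processes for Data Science projects.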

CRISP methodology: User guide

The CRISP methodology comprises six steps, ranging from business understanding to deployment and implementation.

An illustration of the CRISP methodology

1. Business understanding

The first step involves acquiring a good understanding of the business elements and issues that Data Science aims to improve or solve.

2. Data understanding

The second step focuses on precisely identifying the data to be analysed, assessing the quality of the available data and linking it to its business meaning. Since Data Science relies entirely on data, any business issue for which relevant data exists, whether internal or external, can in principle be addressed through Data Science.
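To make the quality assessment concrete, here is a minimal sketch that profiles a small set of records by column. The records, the column names and the `profile` helper are all hypothetical, invented for illustration; a real project would run this kind of check on its own sources.

```python
# Hypothetical sample records; None marks a missing value.
records = [
    {"customer_id": 1, "age": 34, "segment": "retail"},
    {"customer_id": 2, "age": None, "segment": "retail"},
    {"customer_id": 3, "age": 51, "segment": None},
    {"customer_id": 4, "age": 29, "segment": "corporate"},
]


def profile(rows):
    """Return, per column, the share of missing values and the number of distinct values."""
    columns = sorted({key for row in rows for key in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        report[col] = {
            "missing_rate": 1 - len(present) / len(values),
            "distinct": len(set(present)),
        }
    return report


quality = profile(records)
for column, stats in quality.items():
    print(column, stats)
```

A column with a high missing rate, or with far more distinct values than expected, is a signal to go back to the business owners before moving on to preparation.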

3. Data preparation

The data preparation step groups all the activities required to build, from raw data, the precise data set to be analysed. It thus includes selecting data according to chosen criteria, data cleansing and, most importantly, data recoding to ensure compatibility with the algorithms that will be used.

Checking whether numerical data follow a parametric distribution, and recoding them into categorical data where needed, are extremely important tasks. They must be carried out with the greatest care so that the algorithms used in the next step do not produce inaccurate results. All data must be centralised in a structured database known as a Data Hub.
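As an illustration of recoding numerical data into categorical data, the sketch below bins an age value into labelled ranges. The cut points, labels and the `recode_age` helper are assumptions made up for this example, not values prescribed by the methodology.

```python
from bisect import bisect_right

# Hypothetical cut points and labels, chosen for illustration only.
AGE_BOUNDS = [25, 40, 60]
AGE_LABELS = ["under_25", "25_to_39", "40_to_59", "60_plus"]


def recode_age(age):
    """Map a numerical age to a category; missing values get their own category."""
    if age is None:
        return "unknown"
    # bisect_right counts how many bounds are <= age, which is the label index.
    return AGE_LABELS[bisect_right(AGE_BOUNDS, age)]


raw_ages = [34, None, 51, 29, 25, 72, 18]
recoded = [recode_age(a) for a in raw_ages]
print(recoded)
```

Giving missing values an explicit category, rather than dropping them silently, keeps the cleansing decision visible to whoever reviews the model later.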

4. Modelling

This is the Data Science step proper. Modelling includes selecting, configuring and testing various algorithms, as well as deciding on their sequence, which together make up a model. The process is initially descriptive: it generates knowledge and explains why things happened. It then becomes predictive, estimating what will happen, and later prescriptive, as it helps optimise future situations.
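Selecting and testing algorithms can be sketched as fitting several candidates on the same training data and comparing them on data held back for testing. The data, the two candidate models and the error measure below are assumptions chosen to keep the example short; real projects usually compare richer algorithms.

```python
from statistics import mean

# Hypothetical data: advertising spend (x) and sales (y), invented for illustration.
train = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
test = [(6.0, 12.0), (7.0, 13.9)]


def fit_mean(data):
    """Baseline candidate: always predict the average target seen in training."""
    avg = mean(y for _, y in data)
    return lambda x: avg


def fit_line(data):
    """Second candidate: ordinary least squares on a single variable."""
    xs = [x for x, _ in data]
    ys = [y for _, y in data]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in data) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def mean_absolute_error(model, data):
    """Average absolute gap between predictions and observed values."""
    return mean(abs(model(x) - y) for x, y in data)


candidates = {"baseline_mean": fit_mean, "linear": fit_line}
scores = {name: mean_absolute_error(fit(train), test) for name, fit in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

Keeping a trivial baseline among the candidates shows whether a more elaborate algorithm actually adds anything.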

5. Evaluation

The aim of the evaluation step is to verify the models or knowledge obtained and ensure that they meet the objectives identified at the beginning of the process. The evaluation also informs the decision to deploy a model or, where required, to improve it. At this stage, the robustness and accuracy of the models developed are tested.
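One common way to test both accuracy and robustness is k-fold cross-validation: the model is refitted several times, each time leaving out a different part of the data, and the errors are compared with the target set at the start. The sketch below assumes a single-variable linear model, invented data and a hypothetical `MAX_ACCEPTABLE_ERROR` objective.

```python
from statistics import mean, pstdev

# Hypothetical data and business target, invented for illustration.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1), (6.0, 12.0)]
MAX_ACCEPTABLE_ERROR = 0.5  # assumed objective from the business understanding step


def fit_line(rows):
    """Ordinary least squares on a single variable."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in rows) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def cross_validate(rows, k):
    """Return the mean absolute error measured on each of k held-out folds."""
    errors = []
    for i in range(k):
        held_out = rows[i::k]  # every k-th row, starting at position i
        training = [row for j, row in enumerate(rows) if j % k != i]
        model = fit_line(training)
        errors.append(mean(abs(model(x) - y) for x, y in held_out))
    return errors


fold_errors = cross_validate(data, k=3)
accuracy_ok = mean(fold_errors) <= MAX_ACCEPTABLE_ERROR  # accuracy against the objective
spread = pstdev(fold_errors)  # a small spread across folds suggests a robust model
print(fold_errors, spread, accuracy_ok)
```

If the average error misses the objective, or the spread between folds is large, the process loops back to preparation or modelling rather than moving on to deployment.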

6. Deployment

The final step of the process consists of putting the generated models into production for end users. Its aim is to present the knowledge gained from modelling in a form that can be integrated into the decision-making process.

Depending on objectives, deployment can thus range from the simple generation of a report describing knowledge obtained to the installation of an application that helps leverage the obtained model to predict unknown values for an element of interest.
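At the application end of that range, deployment can be as simple as publishing the fitted parameters so that another program can load them and predict new values. The sketch below assumes a linear model whose parameters are stored as JSON; the parameter values and the `load_predictor` helper are hypothetical.

```python
import json

# Hypothetical fitted parameters; in practice they come from the modelling step.
fitted = {"model": "linear", "intercept": 0.05, "slope": 1.99}

# Publish the model as a plain JSON document that an application can load.
artifact = json.dumps(fitted)


def load_predictor(serialised):
    """Rebuild a prediction function from the published parameters."""
    params = json.loads(serialised)
    return lambda x: params["intercept"] + params["slope"] * x


predict = load_predictor(artifact)
print(round(predict(8.0), 2))
```

Storing parameters in a neutral format keeps the deployed application independent of the tools used during modelling, at the cost of only supporting models simple enough to describe this way.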

An agile and iterative approach

This is an agile and iterative methodology, i.e. each iteration generates additional business knowledge that helps improve the way the next iteration is handled. This is why, even though we market it as a project, Data Science is more of a global approach than a mere project.

The CRISP methodology has been officially adopted by Business & Decision, and is, without a doubt, a key success factor for Data Science projects.

Business & Decision

Data Scientist – Director of the Data Science & Customer Intelligence offerings at Business & Decision France. Also teaches Data Mining & Statistics applied to Marketing at EPF School and ESCP-Europe.



