All posts from Stéphane WALTER
TUTORIAL | Spark Structured Streaming: performance testing
Spark is an open-source distributed computing framework that is more efficient than Hadoop, supports three main languages (Scala, Java and Python) and has rapidly carved out a significant niche...
TUTORIAL | Spark Structured Streaming: from data transformation to unit testing
Spark is an open-source distributed computing framework that is more efficient than Hadoop and supports three main languages (Scala, Java and Python). It has rapidly carved out a significant niche in...
TUTORIAL | Spark Structured Streaming: from data management to processing maintenance
Spark is an open-source distributed computing framework that is more efficient than Hadoop, supports three main languages (Scala, Java and Python) and has rapidly carved out a significant niche...
DataOps: data specification and documentation recommendations for Big Data projects
To exploit the full potential of Big Data projects, proper data documentation is essential. DataOps principles help set up an adequate approach - a prerequisite for the success of all...
[TUTORIAL] First steps with Zeppelin
Zeppelin is the ideal companion for any Spark installation. It is a notebook that allows you to perform interactive analytics in a web browser. You can execute Spark code and...
Tutorial: How to Install a Hadoop Cluster
You have read many articles on Hadoop and now you want to get familiar with it, but how do you install and apply this new technology? The recommended approach is...