Data tutorials, tools and languages
How can you develop with Data tools and languages and make full use of their features?
Querying, indexing, aggregating, updating, analyzing and more: through a series of tutorials, our experts take a concrete approach to applying and deploying the main Data tools and languages.

Spark Structured Streaming: performance testing
Spark is an open-source distributed computing framework that is more efficient than Hadoop and supports three main languages (Scala, Java and Python). It has rapidly carved out a significant niche in Big Data projects thanks to its ability to process high volumes of data in batch and streaming mode. Its 2.0 version introduced us to…

Spark Structured Streaming: from data transformation to unit testing
Spark is an open-source distributed computing framework that is more efficient than Hadoop and supports three main languages (Scala, Java and Python). It has rapidly carved out a significant niche in Big Data projects thanks to its ability to process high volumes of data in batch and streaming mode. Its 2.0 version introduced us to a…

Spark Structured Streaming: from data management to processing maintenance
Spark is an open-source distributed computing framework that is more efficient than Hadoop and supports three main languages (Scala, Java and Python). It has rapidly carved out a significant niche in Big Data projects thanks to its ability to process high volumes of data in batch and streaming mode. Its 2.0 version introduced us to…

Color in Dashboarding: a Love-Hate relationship
When it is time to add color to your dashboards, making the right choices so that everyone can understand the result quickly gets complicated. Colors have a strong impact on a dashboard: they carry meaning and need to be used wisely to be effective. In this on-demand webinar, Jean-Philippe Favre, Data…

How to create efficient data storytelling dashboards?
Thanks to modern BI tools, designing dashboards today seems like a very simple task. However, designing efficient dashboards that are useful and that people want to use is a different story. A good dashboard must follow specific rules, which our Data Artist experts explain in this webinar, available as a replay. How to tell Data-Stories in…

How to use the Python integration in Power BI?
The Python integration in Power BI is a huge step forward from Microsoft. It opens up a wide range of possibilities for extracting and cleaning your data as well as creating attractive, fully customized visuals. Let’s see how it works and how to set up your Python environment in Power BI Desktop. As…

[TUTORIAL] First steps with Zeppelin
Zeppelin is the ideal companion for any Spark installation. It is a notebook that lets you perform interactive analytics in a web browser: you can execute Spark code and view the results in table or graph form. To find out more, follow the guide!

Tutorial: How to Install a Hadoop Cluster
You have read many articles on Hadoop and now you want to get familiar with it, but how do you install and use this new technology? The recommended approach is to install a turnkey virtual machine supplied by a major vendor.