Codemotion Magazine


Piero Savastano, December 13, 2019

Exploring the universe with project HelloExoWorld


Humans are capable of walking on the moon. We can operate an International Space Station and send rovers onto the surface of Mars. We can launch hundreds of satellites into the unknown to gather data. We explore the universe both in person and with machines, through initiatives such as HelloExoWorld.

Our quest to explore the galaxy involves many skilled individuals. They come from different backgrounds, combining different approaches and experimental techniques to answer complex questions. They wonder: are we alone in the universe? Will we become an interplanetary species? At Codemotion Rome 2019, Horacio Gonzalez presented a talk about HelloExoWorld, an initiative that brings together space exploration, big data, and open source software.

The Kepler Project and NASA datasets

Artist's impression of Kepler space telescope

There are many satellites exploring space, moving silently through the cold, dark vacuum, gathering data for research. One of the most famous is the Kepler space telescope, launched by NASA in 2009 and officially operating until 2018.

Kepler can be considered a hero robot. While orbiting the Sun and staring into the Milky Way (our home galaxy), it collected brightness data on over 150,000 stars. Kepler examined stars with the aim of observing orbiting planets, ultimately helping to discover habitable planets and, maybe, alien forms of life.

The United States is not keeping the data to itself: following the noble values of open data, NASA has released around 25 terabytes of data recorded by Kepler. The data is freely available at the NASA open data portal.

The Power of Project HelloExoWorld

Measuring the brightness of stars

How data is used as evidence of new planets

One of the most valuable approaches is the “transit method”. Put simply, we look at how the brightness of a star changes over time and search for short, regularly repeating intervals in which the brightness decreases. When a planet passes between its star and the telescope, it blocks part of the light reaching the telescope, so we observe less brightness for a while. A regular drop in brightness is therefore evidence of a planet orbiting the star.

Stars can have many orbiting planets, so the brightness track can be quite messy and hard to segment. The more planets orbit a star, the more interesting the star is, and the harder its brightness profile is to analyse.
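To make the transit method concrete, here is a minimal Python sketch that generates a synthetic brightness track with regular dips. It is an illustration only, not code from HelloExoWorld: the function names, the box-shaped dips, and the sample-based time units are assumptions. The one physical fact it relies on is that the fractional drop in brightness is roughly the square of the ratio between the planet's radius and the star's radius.

```python
import random


def transit_depth(planet_radius, star_radius):
    """Fractional brightness drop while the planet is in front of the star."""
    return (planet_radius / star_radius) ** 2


def simulate_light_curve(n_samples, period, duration, depth, noise=0.0, seed=0):
    """Relative brightness over time, with box-shaped periodic dips.

    period and duration are expressed in samples (hypothetical units).
    The unobscured star has brightness 1.0.
    """
    rng = random.Random(seed)
    curve = []
    for t in range(n_samples):
        in_transit = (t % period) < duration
        brightness = 1.0 - depth if in_transit else 1.0
        curve.append(brightness + rng.gauss(0.0, noise))
    return curve


# A Jupiter-sized planet has roughly a tenth of the radius of a Sun-like star,
# so it dims the star by about 1% during each transit.
depth = transit_depth(0.1, 1.0)
curve = simulate_light_curve(n_samples=300, period=100, duration=5, depth=depth)

# Samples that sit clearly below the star's normal brightness
dips = [t for t, b in enumerate(curve) if b < 1.0 - depth / 2]
print(f"transit depth: {depth:.4f}, dimmed samples: {len(dips)}")
```

A second planet could be simulated by subtracting another set of dips with a different period and depth, which is exactly what makes real brightness tracks harder to segment.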

The Anomaly Detection Problem

The transit method translates into a time series analysis problem and, more specifically, into an anomaly detection problem. Gonzalez described the main phases of their approach to anomaly detection. The starting point is a series of brightness values over time for a specific observed star. The first step is to reduce noise by downsampling or smoothing, for example with a rolling average. The next step is to subtract the smoothed curve from the original one, so that sudden changes in brightness stand out.
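The two steps described above (smooth, then subtract) can be sketched in a few lines of plain Python. This is a hedged illustration rather than the project's actual pipeline, which runs on WARP10: the function names, the window size, the threshold of three standard deviations, and the synthetic data are all assumptions made for the example. The spread of the residuals is estimated with the median absolute deviation, a choice made here because the dips themselves barely affect it.

```python
import random
import statistics


def rolling_mean(values, window):
    """Centred rolling average; the window shrinks near the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def find_dips(values, window=51, n_sigmas=3.0):
    """Indices where brightness falls well below the smoothed curve."""
    smoothed = rolling_mean(values, window)
    residuals = [v - s for v, s in zip(values, smoothed)]
    # Robust spread estimate: median absolute deviation scaled to a sigma
    centre = statistics.median(residuals)
    mad = statistics.median(abs(r - centre) for r in residuals)
    threshold = -n_sigmas * 1.4826 * mad
    return [i for i, r in enumerate(residuals) if r < threshold]


# Synthetic light curve (hypothetical data): flat star, 1% dips, mild noise
rng = random.Random(42)
curve = []
for t in range(300):
    in_transit = 50 <= t % 100 < 55
    curve.append((0.99 if in_transit else 1.0) + rng.gauss(0.0, 0.001))

print(find_dips(curve))
```

The window has to be noticeably wider than a transit, otherwise the rolling average follows the dip and the subtraction erases it.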

On the technical side, this kind of analysis runs with the help of an open source tool named WARP10, specifically designed to handle time series over massive amounts of data. WARP10 was initially developed at OVH, as a cloud infrastructure monitoring tool. It was then open sourced and also put to use on open science challenges. Among WARP10’s time series analytics functions are moving averages, ARMA, hidden Markov models, Fourier transforms and entropy encoding.

When scientists and cloud engineers team up

An interesting aspect of project HelloExoWorld is the synergy between NASA scientists and OVH engineers. Scientists are excellent at writing complex time series algorithms, taking astrophysics into account and going into detailed analysis, statistics and visualisations of the Kepler dataset. On the other side, cloud engineers are excellent at dealing with massive amounts of data, scaling both hardware and software, and keeping such a huge system monitored and resilient.

Horacio explained:

“We scan big amounts of data using standard time series analytics, and identify interesting planets which NASA scientists can study deeper with advanced algorithms.”

That’s a perfect example of collaboration between research and industry, reaching a delicate equilibrium between scientific questioning and technological solidity.

Project HelloExoWorld is a glimpse into the unknown universe, but also a glimpse into the future. What is unique about this endeavour, in my view, is its combination of community values, openness (open data, open science and open source are all involved), space exploration, big data, cloud computing and, most of all, dreaming big. It is a project in which science and industry lend each other their strengths, curiosity is the main driving value, and looking at the stars is both romantic and useful.

Take a look at Horacio's slide deck. If you're a fan of space exploration, you might enjoy our celebrations of the 50th anniversary of the moon landing, from this year's Codemotion Milan.


Piero Savastano
I come from pure research in bioinspired robotics and now work as an independent data scientist. I want to help developers understand what machine learning is and how they can use it.