FOSDEM'21 HPC, Big Data, and Data Science Devroom
Sunday, February 7th, 2021, Brussels, Belgium
Welcome to the 6th edition of the HPC, Big Data and Data Science devroom, co-located with FOSDEM 2021. FOSDEM is an annual conference about free and open source software, attended by over 8,000 developers and open-source enthusiasts from all over the world. This devroom is organised by representatives from the HPC and Big Data communities, who are joining forces to bring both communities together.
High Performance Computing (HPC) and Big Data are two important approaches to scientific computing. HPC typically deals with smaller, highly structured data sets and huge amounts of computation while Big Data, not surprisingly, deals with gigantic, unstructured data sets or data streams, usually processed with the help of distributed systems. When the Big Data trend unlocked access to an unprecedented amount of data, Data Science emerged to tackle the problem of creating processes and approaches to extracting knowledge or insights from these data sets. Machine learning and predictive analytics algorithms have joined the family of more traditional HPC algorithms and are pushing the requirements of cluster and data scalability.
Free and Open Source communities have been the foundation of the HPC and Big Data communities for some time. In the HPC community, it should be no surprise that, according to the Top500 supercomputers list, 100% of the listed supercomputers run Linux. On the Big Data side, the Apache Big Data ecosystem (e.g. Apache Hadoop/Flink/Spark/Kafka) has received a tremendous amount of Open Source contributions from a wide range of organizations coming together under the Apache Software Foundation.
Our goal is to bring the communities together, share expertise, learn how we can benefit from each other’s work and foster further joint research and collaboration. We welcome talks about Free and Open Source solutions to the challenges presented by large scale computing, data management and data analysis.
The HPC, Big Data, and Data Science devroom will take place on Sunday February 7th 2021.
Unfortunately, this time FOSDEM will only take place online, and not physically in Brussels as usual, due to the ongoing COVID-19 pandemic (see also the CfP at the FOSDEM website).
Join us to enjoy a full day of talks, demos and interesting discussions on open-source HPC, Big Data and Data Science.
Sounds interesting? Submit your talk proposal below and see you online!
Topics of interest include, but are not limited to:
- Architecture and design of High Performance Computing (HPC) and Big Data systems
- Architecture and design of Extract, Transform and Load (ETL) and data acquisition pipelines
- Data security and governance
- Tools and technologies related to HPC and computational science, for example:
- Multithreading (OpenMP, etc.)
- Distributed computing (MPI, etc.)
- GPGPU computing (OpenCL, OpenACC, etc.)
- Parallel filesystems and storage
- Large-scale performance analysis and debugging
- Computational paradigms for Big Data systems
  - MapReduce engines
  - Streaming engines
  - SQL engines
  - Dataflow engines
- Emerging hardware trends of large scale clusters
  - Large scale memory pooling
  - High-speed interconnects
  - ARM cluster architecture
- System administration of HPC and Big Data clusters
- User support tools
- Machine learning libraries and tools
- Scientific software applications, tools and libraries (across all scientific domains)
- Big Data platforms, extensions to existing systems, libraries, APIs
- Experience reports on using Big Data systems, for example:
- Large-scale deployments
- Development and configuration issues
- Tuning and performance tips and lessons learned
- Interesting Big Data use-cases and applications
- Comparative analysis of existing systems, evaluation results, performance studies
- Interdisciplinary HPC/Big Data use-cases, for example:
  - Applications using both HPC and Big Data technologies
  - Integration issues
  - Open research problems on the convergence of HPC and Big Data
  - Running MPI jobs on Big Data clusters and vice-versa
We invite presenters to submit talk proposals to present high-quality work with sufficient background material to be clear to the HPC, Big Data, and/or Data Science communities. Talk proposals should be submitted through the FOSDEM Pentabarf server. Submissions must include:
- Abstract (plain text, a couple of paragraphs)
- Session type
- Session length
- Expected prior knowledge / intended audience
- Speaker bio
- Links to code / slides / material for the talk (optional)
- Links to previous talks by the speaker
Our intention is to have a full day of talks of about 20 minutes each, with an additional 5-10 minutes for questions by attendees.
We would also like to note:
- All accepted talks will be about (using) free and open source software. We highly discourage “marketing” talks.
- Due to the online format of FOSDEM’21, accepted talks will have to be pre-recorded.
- Final presentations, including recording, will be due by mid January 2021.
- A devroom volunteer will work together with the speaker to ensure a high-quality recording is available in time.
- Talks will be streamed at the planned time slot on Sunday February 7th 2021 (following Brussels time).
- Speakers are expected to join their session for live Q&A with attendees after the talk.
When submitting your talk in Pentabarf, make sure to select the ‘HPC, Big Data, and Data Science Devroom’ as the ‘Track’.
If you already have a Pentabarf account from a previous FOSDEM edition, please reuse it. Create an account if, and only if, you don’t have one from a previous year. If you have any issues with Pentabarf, do not despair: contact hpc-bigdata-devroom [at] lists.fosdem.org.
Call for participation available: Monday Nov 30th 2020
Call for participation closes: Fri Dec 18th 2020 – no further extensions!
Devroom schedule available: Wednesday Dec 30th 2020
Devroom date: Sunday February 7th 2021
If you would like to create an associated event for the devroom, please fork the page and send a pull request.
- John Dey (Fred Hutchinson Cancer Research Center, US)
- Bob Dröge (HPC team at University of Groningen, The Netherlands)
- Chris Edsall (HPC team at University of Bristol, UK)
- Todd Gamblin (Lawrence Livermore National Laboratory, US)
- Fotis Georgatos (EPFL, Switzerland)
- Andy Georges (HPC team at Ghent University, Belgium)
- Sharan Kalwani (DataSwing Corporation LLC, Austin (TX), US)
- Christian Kniep (AWS)
- Jan-Patrick Lehr (Scientific Computing at TU Darmstadt, Germany)
- Bart Oldeman (Compute Canada)
- Ward Poelmans (HPC team at Vrije Universiteit Brussel, Belgium)
- Åke Sandgren (HPC team at Umeå University, Sweden)
- Davide Vanzo (Microsoft Azure HPC)
Please take a moment to read the FOSDEM Code of Conduct.