Bovi-Analytics

Commonly used tools

Which tools are commonly used by collaborators

Programming languages

In my early programming days I worked in VB.NET and did most of my statistics in SAS during my PhD studies. However, as open source became mainstream I switched languages (maybe too often). Nowadays, I tend to advise using several languages. Once you know one, switching to another becomes easier. And there is always Stack Overflow!

  • [R] for smaller datasets that need quick and dirty frequentist statistics. Often rendered using R Markdown (Rmd) notebooks and pushed to GitHub, e.g.

    • Salamone et al
  • Python and PySpark for many of the neural net studies, e.g.

    • van Leerdam et al
  • Scala when performing more complex data engineering projects.

Local processing (when data fits on-prem)

Development environments

For my local processing, I combine multiple IDEs. I tried one-size-fits-all approaches such as Visual Studio but ended up using the following options.

  • Python -> PyCharm -> Jupyter Notebook

  • [R] -> RStudio

  • Scala -> IntelliJ IDEA

Cloud processing (when data doesn’t fit on-prem)

Data processing framework

  • Python and [R] can be easily used through Google Colab.

  • Python, [R] and Scala can be run on Databricks Community Edition for students. Apply for the Community Edition here; be sure not to click any of the providers (Google, Amazon, Azure) but select ‘Get started with Community Edition’.

  • When in need of larger-scale processing power, I love to use Apache Spark on Azure Databricks for distributed analysis and parallel processing.

  • Microsoft Azure as my current cloud platform.
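To give a feel for what "distributed analysis and parallel processing" means in practice, here is a minimal sketch of the split, map and reduce pattern that frameworks such as Apache Spark apply at scale. It is not Spark itself: it uses only the Python standard library and threads on a single machine, and the herd records, column meanings and function names are all hypothetical, made up for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical records (herd_id, milk_yield_kg); values are made up.
records = [
    ("herd_a", 30.0), ("herd_b", 25.0), ("herd_a", 34.0),
    ("herd_b", 27.0), ("herd_c", 40.0), ("herd_a", 32.0),
]


def split_into_partitions(rows, n_partitions):
    """Deal rows round-robin into n_partitions lists."""
    return [rows[i::n_partitions] for i in range(n_partitions)]


def partial_sums(partition):
    """Map step: compute a (sum, count) pair per key within one partition."""
    acc = defaultdict(lambda: [0.0, 0])
    for key, value in partition:
        acc[key][0] += value
        acc[key][1] += 1
    return dict(acc)


def merge(partials):
    """Reduce step: combine the partial pairs and return the mean per key."""
    total = defaultdict(lambda: [0.0, 0])
    for part in partials:
        for key, (part_sum, part_count) in part.items():
            total[key][0] += part_sum
            total[key][1] += part_count
    return {key: s / c for key, (s, c) in total.items()}


def mean_by_key(rows, n_partitions=3):
    """Process each partition in its own worker, then merge the results."""
    partitions = split_into_partitions(rows, n_partitions)
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        partials = list(pool.map(partial_sums, partitions))
    return merge(partials)


mean_yield = mean_by_key(records)
print(mean_yield)
```

Because each partition only produces sums and counts, the partial results can be merged in any order, which is what makes this kind of aggregation easy to spread over many workers. In PySpark the same idea would typically be expressed as a `groupBy` on the key column followed by an average, with the cluster handling the partitioning.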

Other tools

  • GitHub as code repository; this website is hosted through GitHub Pages.
  • This website is built using Quarto, a new tool that integrates well with RStudio and other editors.
  • Markdown (https://www.markdownguide.org/).

Content 2024 by Miel Hostens
All content licensed under a Creative Commons Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0) and MIT License

 
