Introduction to Interactive Data Analytics for New Users

What is the Interactive Data Analytics Service?

The Interactive Data Analytics Service (IDAS) is provided by Information Technology Services - Research Services (ITS-RS). IDAS supports large-scale and collaborative data analytics using interactive tools such as RStudio for R and Jupyter Notebook for Python, R, and Julia. Users will benefit from High Performance Computing (HPC) resources while performing their interactive data analysis tasks.

What are Python, R, and Julia?

Python, R, and Julia are among the most popular programming languages for data analysis. Python and Julia are general-purpose languages, and R is a language for statistical computing and graphics. All three are open-source projects and free for everyone to use, although some companies provide commercial support and/or extensions for their customers. While Python, R, and Julia share many important features as high-level, open-source programming languages, each has its own strengths and weaknesses from a data-science perspective.

What is interactive data analytics?

Interactivity is one of the key features of data analytics. Due to the nature of data analysis, it is challenging to write perfect code on the first try. You write and run some code, look at the output, and then, based on that output, write and run more code. You repeat this cycle until you reach a final, satisfactory result. That process is called interactive data analytics.
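As a rough illustration of that write, run, and inspect cycle, here is a minimal Python sketch. The data values, the variable names, and the cutoff are all hypothetical and chosen only for this example; in practice each step would typically be run as a separate cell or command, with the next step decided after looking at the output.

```python
from statistics import mean, median

# Hypothetical sample data: a handful of readings with one suspicious value.
readings = [12.1, 11.8, 12.4, 120.0, 12.0, 11.9]

# Step 1: run a first summary and look at the output.
first_mean = mean(readings)
first_median = median(readings)
print("mean:", first_mean, "median:", first_median)

# Step 2: the mean is far above the median, so inspect the largest values.
print("largest values:", sorted(readings)[-2:])

# Step 3: based on that output, drop values above a hypothetical cutoff
# and rerun the summary.
cutoff = 50.0
cleaned = [r for r in readings if r <= cutoff]
cleaned_mean = mean(cleaned)
print("mean after cleaning:", cleaned_mean)
```

The point of the sketch is that step 3 could not have been written before seeing the output of steps 1 and 2, which is why an interactive tool is a good fit for this kind of work.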

What are Jupyter Notebook and JupyterHub?

Python supports IPython, which stands for interactive Python, for interactive scripting in a shell environment. The Jupyter Notebook is an open-source web application built on top of IPython. It allows you to create and share documents that contain live code, equations, visualizations, and narrative text. Uses of the Jupyter Notebook include data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more. The Jupyter Notebook supports not only Python but also many other commonly used programming languages such as R, Java, and Julia.

JupyterHub is a multi-user server version of the Jupyter Notebook. It can be used by a class of students, a corporate data science group, or a scientific research group. The Hub spawns, manages, and proxies multiple instances of the single-user Jupyter notebook server.

What is RStudio?

RStudio is a free and open-source integrated development environment (IDE) for R. RStudio Desktop is a standalone desktop application that works with the version of R you have installed on your local computer. RStudio Server is a Linux server application for multiple users that provides a web-based interface to the version of R running on the server. 

Okay! You've convinced me. I want to use IDAS. How do I start?

We are truly glad that you want to give IDAS a try.

Who do I contact if I have questions?

Please reach out to Cody B Johnson and Giang Rudderham at research-computing@uiowa.edu.