NITheCS Colloquium: Prof Mattia Vaccari, The Ilifu Cloud Computing Facility & X-Informatics Data Intensive Research

Monday, 7 June, 4pm.

NITheCS (National Institute for Theoretical and Computational Sciences) is hosting a colloquium featuring UWC’s Director of the eResearch Office, Prof Mattia Vaccari.


Over the past decade, the global science enterprise has been transformed by the data generating capabilities of our instruments. Distributed science collaborations creating datasets too large for individual researchers to manage are becoming the norm, and in response, X-Informatics, or the application of data science techniques to different science fields, has evolved into a new and exciting field of applied computer science. In this new big data era, institutions and national communities that have the capacity to design and implement the solutions to effectively extract knowledge from data will play a lead role in science. Those that do not, will not.

The Ilifu project was set up to address this challenge in Astroinformatics and Bioinformatics. Ilifu is building cross-disciplinary teams to undertake research and development in technologies and big data science to build capacity for South African researchers to be globally competitive in the era of big data.

In this presentation, Prof Vaccari will talk about Ilifu, its partnership model, goals and research programs, with a particular focus on multi-wavelength galaxy evolution studies, and outline a vision for a federated South African Data Intensive Research Cloud that empowers researchers to work with and collaborate on big data science projects.

Register for the event, or read more.

Five Years of FAIR

The Future of FAIR

Five years since the formal publication of the FAIR data principles, a newly published white paper, Springer Nature’s The Future of FAIR, looks at the real-world impact of FAIR. An international cohort of research data professionals share their opinions and offer commentary on the impact of the FAIR data principles to date, as well as what the next steps in research data management are.

Varsha Khodiyar on the Springboard blog writes that the “burgeoning open science (or open research) movement aims to make public and charity funded research as transparent and accessible as possible, and available for use to all to use, extend and build on. Research data is central to this vision, and to many policies and initiatives launched in recent years encouraging the adoption of open science practices. The impact of the FAIR data concept on open science advocates, position statements, policies and funding opportunities is unmistakable”.

Read more about the FAIR data principles, find out more about the study, and download The Future of FAIR white paper.

eWorkshop: Command Line Interface for Genomics Beginners

Forensic DNA Lab UWC

The Forensic DNA Lab (FDL, UWC) will be running an eWorkshop (online workshop) on using the Command Line Interface, Unix, shell and other tools for genomics. 

The course will run from 10 June to 15 July, with one lesson a week. It is aimed at graduate students and research scientists who will work with genomic and bioinformatic datasets for the first time. We will help attendees get started with using the CLI to run genomic workflows. No previous experience with CLI tools is required.

More about the eWorkshop

A command line interface (CLI) and a graphical user interface (GUI) are different ways of interacting with a computer’s operating system. The CLI lets you control your computer with commands typed at a keyboard, rather than by operating graphical elements with a mouse and keyboard.

The CLI is important for proficiency in genomics, as most bioinformatics tools run from the shell and have no graphical interface. The CLI is also essential for using remote high performance computing centers such as ILIFU and CHPC.

After the course, participants should be able to:

  1. Discuss practical differences between Unix and Windows;
  2. Navigate and manipulate files and folders using standard bash commands;
  3. Write basic scripts for bash including piping between commands;
  4. Access the ILIFU HPC and submit simple scripts to SLURM; and
  5. Discuss folder/directory structure for genomic projects.
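As an illustration of the last outcome, a project folder structure can be set up programmatically. The sketch below is only a minimal example in Python: the folder names and the `create_project` helper are hypothetical assumptions, not the layout taught in the eWorkshop.

```python
from pathlib import Path
import tempfile

# Hypothetical layout for illustration only; the workshop's own
# recommended structure is not described in this newsletter.
PROJECT_LAYOUT = [
    "data/raw",        # untouched input files, e.g. sequencing reads
    "data/processed",  # files derived from the raw data
    "scripts",         # shell and analysis scripts
    "results",         # figures and tables
    "docs",            # notes and documentation
]


def create_project(root: Path) -> list[Path]:
    """Create every folder in PROJECT_LAYOUT under root and return the paths."""
    created = []
    for relative in PROJECT_LAYOUT:
        folder = root / relative
        # parents=True builds intermediate folders such as "data";
        # exist_ok=True makes the function safe to run more than once.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


if __name__ == "__main__":
    # Build the layout in a temporary directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        for folder in create_project(Path(tmp) / "my_genomics_project"):
            print(folder)
```

Keeping raw data in its own folder, separate from processed files and scripts, is one common way to make it clear which files must never be overwritten.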

The course registration is now closed.

Research Opportunity Announcement: Data Generation Projects for the Bridge to Artificial Intelligence (Bridge2AI) Program

The NIH (US National Institutes of Health) Common Fund’s Bridge to Artificial Intelligence (Bridge2AI) program is designed to propel biomedical research forward by setting the stage for widespread adoption of artificial intelligence (AI) and machine learning (ML) that tackles complex biomedical challenges beyond human intuition. It is a new NIH Common Fund program, and will tap into the power of AI to lead the way toward insights that can ultimately inform clinical decisions and individualize care. AI, which encompasses many methods, including modern ML, offers potential solutions to many challenges in biomedical and behavioral research.

The Bridge2AI program plans to support several interdisciplinary Data Generation Projects (OTA-21-008) and one complementary cross-cutting Integration, Dissemination and Evaluation (BRIDGE) Center (NOT-RM-21-021) to generate flagship data sets and best practices for the collection and preparation of AI/ML-ready data to address biomedical and behavioral research grand challenges. 

It also plans to support the formation of teams richly diverse in perspectives, backgrounds, and academic and technical disciplines. The current Research Opportunity Announcement (ROA) for Data Generation Projects for the Bridge to Artificial Intelligence (Bridge2AI) Program (OT2) (OTA-21-008) requires a Plan for Enhancing Diverse Perspectives (PEDP), a summary of strategies to advance the scientific and technical merit of the proposed project(s) through inclusivity. Visit the Bridge2AI Program Resources page and Program FAQs for additional information on building diverse teams and for PEDP guidance.

To facilitate team building across communities and ensure responsiveness of proposals, NIH strongly encourages potential proposers to participate in the Grand Challenge Team Building Activities taking place in June 2021. Please save the date for these upcoming events:

  • Bridge2AI Program Town Hall: June 9, 2021, 2:00-3:30pm ET
  • Bridge2AI Data Generation Project Module Microlabs: June 14, 16, and 18, 2021, 2:00-4:00pm ET each day
  • Bridge2AI Grand Challenge Team Building Expo: June 23, 2021, 11:00am-5:00pm ET

Further information about how to register and participate in these events, as well as an online networking platform, will be coming soon. Please check the Bridge2AI Scientific Meetings page for updates.

Please refer to the research opportunity announcement (OTA-21-008) for additional information on application submission and review. A Letter of Intent (LOI) is required and must be emailed by 11:59 PM ET on July 20, 2021.

We encourage you to share the Bridge2AI listserv signup with your contacts and networks so they will receive updates on future funding announcements and the latest news from the Bridge2AI program. You can also keep up to date with the latest information by visiting the Bridge2AI website. Questions can be sent to

Read more about the vision for this new program in a recent NLM Director’s blog

UWC Research Data Management (RDM) Practices and Needs Analysis Survey

UWC Library Services and the eResearch Office have created a short survey to gather information about Research Data Management (RDM) practices and needs at UWC. The aim of this survey is to identify current RDM practices with a view towards establishing data management services and guidance for researcher communities at UWC. All UWC faculty, staff, researchers and students are invited and encouraged to participate.

RDM is the process of organising and documenting data processes (collection, description, curation, archiving and publication) within a research project throughout the research life-cycle. Well-managed data leads to coherent, shareable and reusable research, and practising good RDM means that researchers can work far more efficiently with their data.

The survey asks what kinds of research data you collect, where such data is held, and how it is being managed.

Participation is voluntary and no personal information will be requested that might identify individuals. The survey is strictly anonymous, and participants are free to withdraw from the research at any time.

The data will be stored on an internal server in the Kikapu data repository with controlled access. This information will be used to formulate improved institutional procedures for managing and curating research data.

The survey can be accessed at

It will not take longer than 15 minutes to complete, and we kindly ask participants to complete it by May 31st.

If you have any questions or concerns about the research, please feel free to contact us at

Please complete the survey.

If there are any questions, please contact

Research Data Management Short Course

H3ABioNet (Pan African Bioinformatics Network for the Human Heredity and Health in Africa) is offering a short course in Research Data Management (RDM) in June 2021. The course is aimed at graduate students and biomedical scientists who are currently working on clinical genomics and bioinformatics projects in Africa. It will take place over four days, 22-25 June, from 10:00 to 14:00. Registration closes on 24 May.

About the Course

The Research Data Management (RDM) short course will introduce the principles and practices of RDM and provide practical advice for implementing these practices in an African research context. Nicky Mulder is Principal Investigator of H3ABioNet, and leads UCT’s Computational Biology (CBIO) group, which is an ilifu partner.

Topics covered will include data discovery and re-use, data documentation and organization, data standards and ontologies, data storage and security, repositories and policies, FAIR and reproducibility, and best practices in developing an effective Data Management Plan.

After the course, participants should be able to:

  1. Understand what research data management is;
  2. Recognize why research data management is necessary;
  3. Understand the best practices and key aspects of research data management; and
  4. Know which RDM tools are available at their institution and online.

The course will only provide a foundation for continued learning in research data management and will not teach any advanced RDM aspects. 

Find out more and apply.

UWC’s first Data Carpentry Workshop of 2021

From 12-16 April, the eResearch Office hosted UWC’s first Data Carpentry workshop of 2021. It was an online workshop held over five mornings, and was attended by over 20 researchers. The workshop was aimed at students and researchers who want to start learning how to work with their data, and was sponsored by SADiLaR.

The eResearch Office promotes and supports the use of advanced information technologies to enable better, faster and higher-impact research, and we hope to grow the Carpentries community at UWC.

Data Carpentry develops and teaches workshops on the fundamental data skills needed to conduct research. Its target audience is researchers who have little to no prior computational experience, and its lessons are domain specific, building on learners’ existing knowledge to enable them to quickly apply skills learned to their own research. Participants are encouraged to help one another and to apply what they have learned to their own research problems.

Lessons included data organising and cleaning in spreadsheets and with OpenRefine, and data analysis and visualisation with R and RStudio. 

Please contact us if you would like to be added to the UWC Carpentries mailing list.

Webinar: RDM Tools available at UWC and Unpacking UWC’s RDM Policy

On Wednesday 21 April, a webinar was hosted by the eResearch Office and UWC Library, titled RDM Tools available at UWC and Unpacking UWC’s RDM Policy.

Mark Snyders, Manager of Scholarly Communication at the Library, spoke in detail about UWC’s new RDM Policy and how it affects researchers at UWC. Sarah Schäfer, Research Data Specialist at the eResearch Office, presented on the different RDM tools and resources that are available to the UWC research community.

The session was hosted by eResearch Office Director, Prof Mattia Vaccari, and Deputy Director of UWC Library Services, Alfred Nqotole. The presentations were followed by questions and a discussion.

Please send any further questions to

Watch a recording of the webinar.

Carpentries Instructor Training Workshop

North-West University (NWU) is hosting a Carpentries Instructor Training online workshop on 17-21 May 2021. Interested applicants can apply through the website.

The Carpentries project comprises communities of Instructors, Trainers, Maintainers, helpers, and supporters from Software Carpentry, Data Carpentry and Library Carpentry who share a mission to teach foundational computational and data science skills.

The workshop aims to:

  • Introduce participants to evidence-based best practices in teaching.
  • Teach participants how to create a positive environment for learners at workshops.
  • Provide opportunities to practice and build teaching skills.
  • Help participants become integrated into the Carpentries community.
  • Prepare participants to use the teaching skills in teaching Carpentries workshops.

The training is free. However, to avoid wasting places, a penalty will be imposed on applicants who commit to the workshop and then miss it.