Basics of R: a Training Module

In collaboration with DPGS, a ‘Basics of R’ training module will begin on 5 July. The module will include an introductory session, after which training materials will be available through iKamva. The purpose of this training module is to equip postgraduate students with basic knowledge and skills to begin using R.

R is a programming language that is widely used by researchers in various disciplines for data manipulation, calculation and data visualisation. Anyone who would like to begin working with R as part of their research or skills development will benefit from this training module, and the lessons are aimed at people with no previous experience.

Working with Data Training Module Q&A Session

The Working with Data: Training Module, created in collaboration with DPGS, is still available on iKamva, and the Q&A session on Wednesday 8 June is an opportunity to ask questions about spreadsheets and OpenRefine.

The session is also for those who would like to begin working on the module during the vacation period.

ilifu-supported Africa CDC course at SANBI

SANBI are currently teaching a course on SARS-CoV-2 (COVID-19) bioinformatics to visitors from public health labs across Africa. The course is taking place at the University of the Western Cape from 23 to 27 May 2022, and is using ilifu, South Africa’s big data infrastructure for data-intensive research.

Africa CDC aims to strengthen capacities and capabilities at public health institutions in Africa in order to detect and respond quickly and effectively to disease threats and outbreaks. They have data-driven interventions and programmes, and SANBI has been working with Africa CDC since 2018. 

In 2020, the Africa CDC launched its Institute of Pathogen Genomics (IPG), which has been at the forefront of supporting SARS-CoV-2 sequencing on the African continent. SANBI is one of the specialist centres assisting Africa CDC in its work developing pathogen genomics and bioinformatics. As part of this role, they are running a week-long course on SARS-CoV-2 sequence analysis for people from public health labs in nine African countries – Morocco (Institut Pasteur du Maroc), Egypt (Central Public Health Laboratory), Ethiopia (Ethiopian Public Health Institute), Uganda (Center Public Health Institute), Kenya (National Public Health Laboratory), Senegal (Institut Pasteur de Dakar), Zambia (Zambian National Public Health Institute), Ghana (Noguchi Memorial Institute for Medical Research) and South Africa (National Institute for Communicable Diseases). Nigeria (Nigeria CDC) and DRC (Institut National de Recherche Biomédicale) could not attend in person, but are participating online.

Back row (L-R): Ziphozakhe Mashologu, Wael Saif, Harris Onywera, Peter van Heusden, Alan Christoffels, Leonard Kingwara, Ayitewala Alisen, Abebe Negeri, Abdelmajid Eloualid, Amadou Diallo. Front row (L-R): Susan Alicia Fernol, Michelle Lowe, Annie Chan, Tracey Calvert-Joshua, Quaneeta Mohktar, Francis Ahiakpah, Moussa Diagne. Not present: Mpanga Kasonde, Akil Prince, Emmanuel Lokilo.

SANBI have been ilifu partners since its inception, and Peter van Heusden, Senior Bioinformatician at SANBI, is one of the organisers of the workshop. He says that participants have found ilifu to be “an excellent resource to support public health bioinformatics”. ilifu is a node in the South African national data infrastructure which enables South African researchers to be leaders in the strategic science domains of astronomy and bioinformatics.

Course participants working on data analysis in SANBI’s Aaron Klug seminar room, 23 May 2022.

The training relies on cloud infrastructure provided by ilifu – each lab gets their own installation of SANBI’s SARS-CoV-2 Workbench to work on, and gets hands-on experience with uploading data, doing data analysis and visualising their results. While the current training has focused on SARS-CoV-2, the discussions have ranged across a number of other infectious diseases that these public health labs are responding to: HIV, TB, hepatitis, malaria, influenza and other pathogens. SANBI sees this training as feeding into Africa CDC’s efforts to build a Community of Practice in public health bioinformatics and genomic surveillance.

Training Opportunity: Data Science and Machine Learning

A training opportunity for data science and machine learning is available and registration closes on 25 May.
Makerere University School of Public Health (MakSPH) in Kampala, Uganda, is working in partnership with four other African universities to research the COVID-19 response in Africa. Through the collaboration, a community of practice (COP) has been established. It is aimed at developing the capacity of African institutions to prepare for, analyse and respond to disease epidemics successfully.
As part of the COP, and in partnership with IBM Research Africa (IBMRA) scientists, who have expertise in artificial intelligence (AI), data science, and machine learning, the project has organised a capacity-building opportunity on data science and machine learning.
Participants interested in or with a background in data science, artificial intelligence, machine learning and cloud computing are encouraged to register. There is no minimum skill requirement aside from computer literacy.

Topics that will be covered include analysing the impact of COVID-19 on essential health services using time series analysis; learning from COVID-19 models to support what-if scenario analysis; and intervention planning and descriptive statistics to analyse NPIs implemented in African countries.

The training will be conducted online through Webex, with an expected engagement of about 2 hours every week. Facilitation will be in both English and French, and the programme will run until October 2022.

EMPOWER: Digital Humanities Programme

Escalator has created an 8-step programme for learning new digital technologies and skills that enhance research.

The programme is specifically targeted at womxn in humanities and social sciences in South Africa who want to learn and enhance their digital and computational research skills. Folks from any career stage are welcome to join. This includes researchers, postgraduate students, postdoctoral research fellows, librarians, IT staff, research support positions, and related areas. The programme caters for any experience level: from complete novices to those with advanced skills.

Escalator is an exciting addition to the South African Digital Humanities (DH) landscape. It launched last year, and has been working closely with members of the community to understand the needs for upskilling, re-skilling, and learning new tools, technologies and methodologies to enhance research in an increasingly digital world. They are excited to announce the launch of this programme, based on intentional learning principles, through which they will support learning and growth.

The first event takes place on 19 May 2022 from 11:30 – 13:00.
The session will be recorded and shared afterwards.

Working with Data Training Module now active on iKamva

After a successful introductory session on 11 May, the Working with Data: Training Module, created in collaboration with DPGS, is now live on iKamva. For those that missed the introductory session, the recording is also available on iKamva.

The purpose of the training module is to equip postgraduate students with the basic knowledge and skills needed to clean and organise their data using spreadsheets and OpenRefine. The lessons are based on Data Carpentry lessons.

The lesson materials will be available until 8 June.

Opportunity: The Carpentries are Recruiting Instructors

An ongoing opportunity to build teaching skills as part of a global community is available.

The Carpentries are actively recruiting Instructors to teach Centrally-Organised workshops. These workshops (currently being held online) are a great way to connect with a global community, meet new colleagues with shared interests, and share skills with researchers around the world. The Carpentries are currently offering priority admission to their Open Instructor Training program for applicants who indicate interest in teaching Centrally-Organised workshops.

Space is limited, and participation bursaries, valued at ~R6000, are available.
Event dates: 1 – 3 June 2022 (2.5 days)

All applicants are welcome. No specific expertise is necessary, but they do expect that trainees will have the technical knowledge necessary to teach one or more of the core lessons from Data Carpentry, Library Carpentry, or Software Carpentry. Instructor Training events are held online, so anyone with internet access and time to share can participate.

A challenge that we face in the Humanities is providing our students (and colleagues) with an opportunity to learn about computational approaches that they can apply in their current and future contexts. This instructor training opportunity will:

  • Introduce you to evidence-based teaching practices.
  • Teach you how to create a positive environment for learners at your workshops.
  • Provide opportunities for you to practice and build your teaching skills.
  • Help you become integrated into the Carpentries community.
  • Prepare you to use these teaching skills in teaching Carpentries workshops.

More Data Openness in NIH Policy

In what has been described as ‘seismic’, the new data-sharing policy of the NIH (US National Institutes of Health) mandates that all researchers share their data. The NIH is the largest public funder of biomedical research in the world, and this shift could set a global standard for biomedical research.

In January 2023, the US National Institutes of Health (NIH) will begin requiring most of the 300,000 researchers and 2,500 institutions it funds annually to include a data-management plan in their grant applications — and to eventually make their data publicly available.

Nature, 16 February 2022

This certainly is groundbreaking news in a research landscape that has seen a steady, albeit slow, progression toward more openness. Mark Hahnel, founder of Figshare, agrees that this is huge news. He urges the academic community not to lose focus on the potential benefits that open data can have “for reproducibility and efficiency in research, as well as the ability to move further and faster when it comes to knowledge advancement”.

The policy, which applies to research funded by or conducted by NIH that results in the generation of scientific data, establishes the requirements for submission of Data Management Plans (DMPs), and it also emphasises the importance of good research data management (RDM) practices. This includes maximizing the appropriate sharing of scientific data generated from NIH-funded or conducted research, with justified limitations or exceptions.

There is no doubt that this policy will be felt globally, by researchers and academic institutions.

Read the full NIH Policy here.

Find out more about Research Data Management (RDM), Data Management Plans (DMPs) and see our useful DMP Resources and Tools.

Working with Data: a Training Module

In collaboration with DPGS, a ‘Working with Data’ training module will begin on 11 May. The module will include an introductory session, after which training materials will be available through iKamva. The purpose of this training module is to equip postgraduate students with the basic knowledge and skills needed to clean and organise their data in spreadsheets and OpenRefine.

UWC Team wins Cluster Challenge

A UWC student team, the “Parallelizers”, were winners at the CHPC (Centre for High Performance Computing) 2021 Student Cluster Competition, and will go on to compete at the prestigious ISC 2022 Student Cluster Competition later this year. The CHPC Student Cluster Competition gives undergraduate students at South African universities exposure to the High Performance Computing (HPC) world. 

Team members Ruchelle Coetzee, Rofhiwa Matumba, Randall Buckton and Jaco Ferreira are all undergraduate Computer Science students at UWC. The team will be joined by Vanessa Dimtcheva and Edward Ramashia (from the University of the Witwatersrand) to make up the Centre for High Performance Computing (CHPC)’s team competing at the ISC 2022 Student Cluster Competition in Hamburg, Germany in May/June 2022. 

Team mentor from SANBI, Peter van Heusden, provided access to training resources and assistance along the way. “SANBI has been committed to supporting the Student Cluster Competition since 2013, providing technical advice, space on our computing environment and mentorship. We are of course overjoyed at the success of Team Parallelizers!”

Jaco started forming the team in early 2021, and they started training together as soon as they had a full team. The first round of the competition started in June, and the Parallelizers qualified for the next round, which took place in November. Ruchelle, currently a third year BSc Computer Science student, says she was shocked when she found out they had won. Teams were unaware of each other’s progress, so “it was difficult to know if we were on the right track”. “I had known one thing about Linux going into this competition and that was the existence of it”, she jokes. She was aware that some of the other teams already had Linux experience, and had previously dominated the competition, so “there was a factor of intimidation added”. Jaco, a first year at the time of the competition, now in his second year of his BSc, also acknowledges how intimidated he was by the more experienced other teams. He says that this actually “helped us with a sense of competitiveness and allowed us to push that extra little bit”. 

Rofhiwa was motivated by the opportunity to run software and solve problems using very powerful hardware. “As a computer science student, the competition would also provide me with a channel to exercise my computational skills outside of my course content in a very relevant and fast-growing field”. The competition was impressive, with more than forty teams from universities across South Africa and other African countries competing in the first round. The final round was between four teams.

However, they don’t believe that it was just being the lucky underdogs that made them winners – “what ultimately gave us the victory was our team communication”, says Jaco. By speaking openly with each other, “we were able to overcome the difficulties presented by the online environment that we had to work on and were able to understand each other’s strengths and weaknesses.” Randall (currently in his fourth and final year of the extended curriculum BSc Computer Science programme), agrees. “I believe our good team chemistry along with our commitment to this competition led to us winning the competition”. 

The CHPC SCC required teams to build small high performance computing clusters. They were given a selection of applications to optimise and run on their cluster to demonstrate their design’s performance. Each team was assigned a budget and a parts list (the hardware was provided) from which they designed their cluster, and the teams were judged on a combination of their benchmark results and their cluster design. Rofhiwa describes the process: “we were required to benchmark (test the running efficiency) of software that would put good use to the systems made available to us and to develop an understanding of the networking systems that enabled us to do so”. 

“We worked long, hard and smart to ensure we gave it our everything, especially in the final round, and came out on top”, says Randall. What does this win mean for these students going forward? “Winning this competition has boosted my confidence in myself”, says Randall. “The fact that I started with little to no knowledge on how to navigate Linux’s terminal, how to compile things and run benchmarks, and then 7 months later using all of this to win the competition is a testament to myself of what I am capable of doing”. Ruchelle also feels excited about her future. Her knowledge of Linux and cluster computing has “grown exponentially” in the last year, and her interest in the world of high performance computing has been piqued. “This competition allowed me to learn about this new world and gain heaps of exposure by learning through experience”.

“Winning has given me the confidence to pursue high performance computing as a career choice in the future”, says Rofhiwa. “I am very proud of what we have been able to achieve as a team and as a group of friends”.

The ISC 2022 Student Cluster Competition, co-organized by the HPC-AI Advisory Council and ISC Group, will take place during the ISC High Performance Conference in June. The competition will follow a hybrid model, with some teams participating on-site and others, like the CHPC team, virtually. Final submissions are expected mid-May, after which the team will be interviewed and present their findings. 

The ISC HPC (previously known as the International Supercomputing Conference) Student Cluster Competition will include “applications that address education and applied learning towards accelerating bioscience research and discovery”. The student teams will be tasked to test several applications that are used by scientists and researchers. The CHPC team will be competing amongst international peers, all showcasing their expertise in “a friendly yet spirited competition that fosters critical skills, professional relationships, competitive spirit and lifelong friendships”. South Africa has historically performed well in the competition, and although they are also juggling university work, the team has started preparing for the competition. “We are making progress by division of tasks and responsibilities”, says Ruchelle.

The call for participation for the CHPC 2022 Student Cluster Competition will be distributed in the next few weeks and will be communicated here. 

