Job Description
Job Title: Synthetic ML Training Data Generation
Nvidia CSIT Cyber-AI Hub intern in Synthetic ML Training Data Generation (up to a maximum of 15 hours per week for 20 weeks)
The successful candidate will work as an intern with the Nvidia CSIT Cyber-AI Hub project team, assisting research on generating synthetic network and system log data for use in ML training. The role involves helping to create robotic process automation-based tools that simulate human-to-machine and machine-to-machine interactions in an enterprise network, and then using those tools to generate network and machine data for ML training purposes.
Another main responsibility is to assist research on using Large Language Models to generate data comparable to the data produced by the aforementioned tools, and to real logs and network data. This will require researching and creating a data comparison tool.
Qualifications
Degree in Computer Science, Cyber Security or a relevant field.
Have, or be about to obtain, an artificial intelligence-related postgraduate degree.
Skills
Essential criteria:
Advanced Python programming skills.
In-depth knowledge of Artificial Intelligence/Machine Learning.
Prior knowledge of generative AI and Large Language Models.
Experience in data management for AI training.
Proficiency in Linux and Windows, and experience in code management.
Desirable criteria:
Advanced understanding of networking and security best practices.
In-depth knowledge of cyber security concepts, e.g. Cyber Kill Chain and Defense-in-Depth.
Experience in network administration.
Working knowledge of network emulation.
Knowledge of the MITRE ATT&CK framework and other threat modelling tools.