Video Captioning AI Data Trainer

Invisible Technologies

Software Engineering, Data Science
Posted on Wednesday, April 10, 2024

Video Captioning AI Data Trainer - Fully Remote

Start a career in tech: Join the team that’s supporting the latest cutting-edge AI language models.

About Invisible:

Invisible Technologies is dedicated to fusing human creativity and intuition with cutting-edge technology to create a future rich in impact and meaning. We firmly believe that human involvement is essential in unlocking the full potential of AI, ensuring it is developed with greater accuracy, quality, safety, reliability, and fairness. This human-centric approach underpins our unique process orchestration engine, a platform that merges artificial and human intelligence with automation to remove operational bottlenecks and pave the way for growth and innovation for our clients.

The Business Context:

You already use AI in many ways, such as deciding what products and services to order, and it may be most familiar to you as a chatbot, an avatar-maker, or a way to unlock your screen. But here’s what AI may be able to help the world with: finding medical diagnoses, teaching you about scientific research, and calculating the complexities of any function.

But, like humans, algorithms are what they eat. They’re only as good as the rules they know and the data they’re trained on. We’re the team that helps model these behaviors.

The Work:

This work involves annotating video clips, including the focus, action, camera movements, and other details within the video. The videos do not contain sensitive content and are shot in various styles and forms, such as home videos, animations, and video games.

These annotations are used to help train new capabilities into foundation models and contribute to the future of visual media.

The Person:

The perfect person for this work is adept at observing and describing visual media in English. Though the role will work on the cutting edge of technology, neither a technology nor a visual media background is required. Instead, we are looking for those who are interested in using language to describe what they see. We are looking for someone who has the following abilities:

  • Keen eye for spotting written mistakes based on pre-set criteria
  • Advanced skills in grammar, syntax, and spelling
  • Ability to add information without changing the original structure
  • Great eye for detail
  • Technical writing
  • Ability to capture subtlety and visual nuance with language
  • Adaptable to operational changes as well as instructional changes
  • Narrative & descriptive writing (of visuals & non-verbal actions)
  • Comfortable learning new software tools
  • Level of English: C2
  • Film industry knowledge (preferred)


This is an entry-level contractor role perfect for fast typists, writers of various kinds, and people keen to be at the forefront of visual storytelling.

Pay starts between $17 and $22.75 per hour for top applicants.