"

Foundational Knowledge

Introduction

Audio and video are powerful tools for teaching, learning, and communication. They can bring concepts to life, demonstrate real-world examples, and make content more engaging. However, without careful attention to accessibility, multimedia can also create barriers for learners with disabilities. Multimedia is more than just video: it encompasses audio, voice-over slide presentations, screencasting, video, and immersive videos (Crawford, 2025).

Accessibility in multimedia ensures that everyone, including individuals who are Deaf, hard of hearing, or blind, who have low vision, or who process information differently, can fully participate in and benefit from the content. Providing clear audio, thoughtful video design, accurate captions, and comprehensive transcripts not only supports learners with disabilities but also benefits all users. For example, captions help non-native speakers, transcripts allow quick searching and studying, and high-quality recordings improve the learning experience for everyone. Accessible multimedia ensures that:

  • Everyone can access the information regardless of disability.
  • Content is perceivable in multiple formats (visual and auditory).
  • Institutions meet accessibility and compliance requirements (such as WCAG 2.1 guidelines).

In this section, you will learn best practices for:

  • Capturing high-quality audio
  • Designing accessible videos
  • Providing captions and transcripts
  • Understanding how captions and transcripts support accessibility

By applying these practices, you will make your multimedia content more inclusive, usable, and effective for a wide range of learners.

Capturing High-Quality Audio

Clear, high-quality audio is essential—not just for the listening experience, but also for the accuracy of captions and transcripts.

Best practices for audio quality:

  • Quiet environment: Record in a location with minimal background noise.
  • Microphone placement: Use an external microphone when possible. With any microphone (built-in or external), position it close to the speaker’s mouth, but not so close that the audio distorts, to minimize background noise. According to Crawford (2022), if you use an external microphone, make sure it is the right kind. Crawford describes three options:
    1. Unidirectional – these microphones capture audio from one direction. They help minimize background noise and are best for recording a single person.
    2. Bidirectional – these microphones capture audio from two directions. They are especially useful for interviews when placed between two people.
    3. Omnidirectional – these microphones capture audio from all directions. They are the type typically built into laptops.
  • Consistent volume: Speak at a steady pace and volume.
  • Avoid overlapping voices: If multiple speakers are present, take turns speaking clearly.
  • Prepare for the recording: According to Crawford (2022), you should know what you want to say before hitting the record button, even for spontaneous comments.
  • Allow for space: Crawford (2022) urges us to wait three seconds after pressing the record button before speaking. Do the same at the end of the recording, before stopping, so that none of the recording is cut off.

Video Design Considerations

Videos should be designed so that learners can both see and hear key information, or access it through alternative formats. Additionally, your content needs to be organized in a way that will help your students learn the content (Crawford, 2019). Make sure you are prepared and know what you are going to say before you start recording. Taking notes or writing a script can be very helpful.

Tips for accessible video design:

  • Quality audio: Use a good microphone and speak as close to it as you can (see the section on audio above).
  • Visible speakers: Studies are divided on whether appearing on camera makes videos more effective. Some have shown that a visible speaker gives your videos a social presence and lets students see your social cues (Kizilcec et al., 2015). Your presence may also increase student satisfaction (Wang & Antonenko, 2017). If you choose to be on video, ensure your face is well lit for those who lip-read. Lighting should be in front of the speaker, not behind. Also consider your camera angle: your face should be level with the camera lens. Be aware of your position as well. Do you want just your face to show? Your whole body? Either way, look at the camera while recording.
  • Avoid flashing or strobe effects: Many students find flashing effects distracting or disturbing, and for some students they can trigger seizures or cause discomfort. This does not rule out images in your video; images can aid students in comprehending the content.
  • On-screen text and visuals: Use large, high-contrast fonts and ensure graphics are described verbally. On-screen text should use only “keywords from the spoken portion of your presentation” (Crawford, 2025, p. 8).
  • Pacing: Allow time for learners to process information before moving on. Pause when speaking so students can absorb what they just watched. Don’t overload them with content; instead, chunk it into manageable pieces. This helps students remember what they are learning.
  • Length of video: There are many different opinions on how long a video should be. Remember, your students can only process so much at a time. Studies have suggested anything from three to four minutes to under 15 minutes (Crawford, 2025). A study by Afify (2020) showed that students performed better when the videos were shorter (see more details in Crawford’s 2025 article). Remember, the important part is your content and what you are trying to communicate to your students.
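The “high-contrast fonts” advice above can be checked numerically. WCAG 2.1, the guideline mentioned earlier in this chapter, defines a contrast ratio between a text color and its background and asks for at least 4.5:1 for normal-size text at level AA (3:1 for large text). The sketch below is one possible way to compute that ratio in Python. The function names are illustrative and not part of any library, and the cited sources do not prescribe a particular tool.

```python
def _linearize(value):
    """Convert one 0-255 sRGB channel to its linear value (WCAG 2.x formula)."""
    c = value / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance of an (R, G, B) color, from 0.0 (black) to 1.0 (white)."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background):
    """WCAG contrast ratio between two colors, from 1.0 up to 21.0."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def meets_aa_normal_text(foreground, background):
    """True if the pair reaches the 4.5:1 ratio WCAG asks for normal-size text."""
    return contrast_ratio(foreground, background) >= 4.5


if __name__ == "__main__":
    white = (255, 255, 255)
    for name, color in [("black", (0, 0, 0)), ("light gray", (200, 200, 200))]:
        ratio = contrast_ratio(color, white)
        verdict = "passes" if meets_aa_normal_text(color, white) else "fails"
        print(f"{name} on white: {ratio:.2f}:1 ({verdict} AA for normal text)")
```

A check like this is only a starting point: it says nothing about font size, text placed over moving video, or whether the visuals are also described verbally.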

Providing Captions and Transcripts

Captions and transcripts are essential accessibility features for audio and video content. They not only support learners with disabilities but also benefit many others, such as people in noisy or quiet environments and non-native speakers.

About Captions

What are captions?

Captions are text displayed on-screen that provide a written version of the spoken dialogue and relevant sounds.

How do captions help with accessibility?

  • Provide access for deaf and hard-of-hearing individuals.
  • Aid learners who process information better visually.
  • Support comprehension for English language learners.
  • Allow for use in sound-sensitive environments (quiet libraries or noisy commutes).

What to include in captions

  • All spoken words, including dialogue and narration.
  • Identifiers of who is speaking when it’s not clear.
  • Relevant sound effects (e.g., [door creaks], [laughter], [music fades]).
  • Accurate spelling, grammar, and punctuation.
  • Description of what is on the screen, if the narration is not providing that description.

Why Auto-Captions Are Not Reliable

Many video platforms, such as YouTube, now offer automatic captioning features. While this technology can be a helpful starting point, it should never be considered a final solution for accessibility. Auto-captions are generated by speech recognition software, which often introduces errors that reduce accuracy and usability.

Parton (2016) reviewed course videos that were auto-captioned by YouTube’s system and found a total of 525 errors (7.7 errors per minute). When you use auto-captions, it is vital to check them and edit them accordingly.
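To put those figures in perspective, the short calculation below works out what they imply. Only the two input numbers come from Parton (2016) as quoted above; the footage length and the gap between errors are derived here and are not reported in this chapter.

```python
# Figures reported by Parton (2016), as quoted in the text.
total_errors = 525
errors_per_minute = 7.7

# Derived values (not stated in the source): the implied amount of footage
# reviewed and the average time between caption errors.
implied_minutes = total_errors / errors_per_minute
seconds_between_errors = 60 / errors_per_minute

print(f"Implied footage reviewed: about {implied_minutes:.0f} minutes")
print(f"Average gap between errors: about {seconds_between_errors:.1f} seconds")
```

In other words, a learner relying on those unedited captions would meet an error roughly every eight seconds.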

Limitations of auto-captions:

  • Accuracy issues: Names, technical terms, and specialized vocabulary are frequently mistranscribed.
  • Missed context: Auto-captions often fail to capture tone, emphasis, or speaker identification.
  • Sound effects omitted: Non-speech sounds (e.g., [applause], [music playing]) are usually ignored.
  • Accents and clarity: Speakers with strong accents, fast speech, or background noise can result in high error rates.
  • Grammar and punctuation: Auto-captions often lack proper sentence structure, making them harder to read and follow.

Why this matters for accessibility

For captions to be truly accessible, they must be accurate, complete, and easy to understand. Auto-captions that contain errors can confuse or mislead learners, especially those who rely on captions as their primary means of accessing content.

Remember: Always review and edit auto-captions for accuracy before publishing.

Crawford (2021) highlights the Described and Captioned Media Program (DCMP) and its recommendations that captions should be:

  • Synchronized and appear at approximately the same time as the audio is available
  • Verbatim when time allows, or as close as possible
  • Equivalent and equal in content
  • Accessible and readily available to those who need or want them

Additionally, the DCMP recommends the following principles when possible (see Crawford, 2021):

  • Captions appear on-screen long enough to be read
  • Limit on-screen captions to no more than two lines
  • Speakers should be identified when more than one person is on-screen or when the speaker is not visible
  • Punctuation is used to clarify meaning
  • Spelling is correct throughout the production
  • Sound effects are written when they add to understanding
  • All actual words are captioned, regardless of language or dialect
  • Use of slang and accent is preserved and identified
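Two of the principles above, the two-line limit and staying on-screen long enough to be read, lend themselves to a quick automated check while you edit captions. The following is a minimal sketch under stated assumptions: the `Cue` structure and `check_cue` function are hypothetical names, and the one-second minimum is a placeholder chosen for illustration, not a DCMP figure. Replace it with whatever standard your institution follows.

```python
from dataclasses import dataclass
from typing import List

MAX_LINES = 2      # the two-line limit from the guidance quoted above
MIN_SECONDS = 1.0  # ASSUMED placeholder threshold, not a DCMP value


@dataclass
class Cue:
    """One caption as it appears on screen (hypothetical structure)."""
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str     # caption text; on-screen lines separated by "\n"


def check_cue(cue: Cue) -> List[str]:
    """Return human-readable problems found in one caption cue."""
    problems = []
    lines = [line for line in cue.text.split("\n") if line.strip()]
    if not lines:
        problems.append("caption is empty")
    if len(lines) > MAX_LINES:
        problems.append(f"{len(lines)} lines on screen (limit is {MAX_LINES})")
    duration = cue.end - cue.start
    if duration < MIN_SECONDS:
        problems.append(f"shown for {duration:.1f}s (minimum is {MIN_SECONDS:.1f}s)")
    return problems


if __name__ == "__main__":
    cues = [
        Cue(0.0, 3.0, "Welcome to the course.\n[music fades]"),
        Cue(3.0, 3.4, "Today we will cover\nthree topics:\naudio, video, captions."),
    ]
    for number, cue in enumerate(cues, start=1):
        for problem in check_cue(cue):
            print(f"Cue {number}: {problem}")
```

A script like this can flag layout and timing problems, but it cannot judge accuracy, spelling, or speaker identification; those still need a human review.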

Speech-to-text systems, like the one YouTube uses, are a good starting point for captions. Relying solely on auto-captions, though, risks excluding learners instead of including them. Taking the time to ensure captions are accurate and meaningful demonstrates a true commitment to accessibility.

Audio Transcripts

What are transcripts?

Transcripts are written versions of spoken audio, provided as a separate text file or page. Unlike captions, they do not appear on-screen during playback but are available for learners to read independently.

Benefits of transcripts

  • Provide full access to audio content for deaf or hard-of-hearing users.
  • Allow learners to quickly search, review, or study content.
  • Can be used by screen readers for blind or low-vision learners.
  • Helpful for learners who prefer text-based study materials.

Best practices for transcripts

  • Include all spoken dialogue and relevant non-speech audio information.
  • Identify speakers clearly when needed.
  • Make transcripts easy to find—link them near the video or audio file.
  • Format with headings, bullet points, or time markers if the content is long.
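As one illustration of the practices above, the sketch below turns a list of spoken segments into a plain-text transcript with time markers and speaker labels. The segment layout (start time in seconds, speaker, text) and the function names are assumptions made for this example, not a required format.

```python
def format_timestamp(seconds):
    """Render a number of seconds as HH:MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def build_transcript(segments):
    """Build transcript text from (start_seconds, speaker, text) tuples.

    Non-speech audio can be included in the text in square brackets,
    for example "[laughter]".
    """
    lines = []
    for start, speaker, text in segments:
        lines.append(f"[{format_timestamp(start)}] {speaker}: {text}")
    return "\n".join(lines)


if __name__ == "__main__":
    segments = [
        (0, "Instructor", "Welcome to this week's lecture."),
        (75, "Student", "Thanks for having us. [laughter]"),
    ]
    print(build_transcript(segments))
```

Saving the result as a text file or web page linked next to the video keeps the transcript easy to find and easy to search.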

Key Takeaways

  • Multimedia must be designed so all learners can access it equally.
  • High-quality audio improves clarity, comprehension, and caption accuracy.
  • Accessible videos consider both visuals and audio cues.
  • Captions and transcripts are critical accessibility tools that make content inclusive, searchable, and usable in varied contexts.

Summary

Multimedia has the power to engage, explain, and inspire, but only when everyone can access it. By incorporating accessibility practices into your audio and video content, you ensure that learners with diverse needs have equal opportunities to participate and succeed.

Key points to remember:

  • High-quality audio improves clarity and supports accurate captions and transcripts.
  • Accessible video design ensures information is both visible and understandable.
  • Captions provide real-time, on-screen text for dialogue and important sounds.
  • Transcripts offer a complete written record that is searchable, readable, and usable in different formats.

When you provide captions and transcripts, you are not just meeting accessibility requirements—you are creating content that benefits everyone. Accessible multimedia enhances learning, supports multiple ways of engaging with content, and helps you meet compliance standards such as WCAG.

Next steps for you:

  • Review your existing multimedia to see if captions and transcripts are included.
  • Apply best practices for audio and video in future projects.
  • Remember: Accessibility is not an add-on; it’s a core part of effective and inclusive teaching.

By making accessibility a priority, you make your content stronger, more inclusive, and more impactful.

Resources

Afify, M. K. (2020). Effect of interactive video length within e-learning environments on cognitive load, cognitive achievement and retention of learning. Turkish Online Journal of Distance Education, 21(4), 68–89.

Crawford, S. R. (2025). Beyond Video: Harnessing Multimedia for Engaging Online Courses [White Paper]. Quality Matters.

Crawford, S. R. (2022, March 15). The role of audio presentations. Quality Matters.

Crawford, S. R. (2021, February 15). Captions help ALL learners. Quality Matters.

Crawford, S. R. (2019, December 18). Designing multimedia presentations for your course. Quality Matters.

Kizilcec, R. F., Bailenson, J. N., & Gomez, C. J. (2015). The instructor’s face in video instruction: Evidence from two large-scale field studies. Journal of Educational Psychology, 107(3), 724–739.

Parton, B. (2016). Video captions for online courses: Do YouTube’s auto-generated captions meet deaf students’ needs? Journal of Open, Flexible, and Distance Learning, 20(1).

Wang, J., Antonenko, P., & Dawson, K. (2020). Does visual attention to the instructor in online video affect learning and learner perceptions? An eye-tracking analysis. Computers & Education, 146.

License

MCCCD Accessibility Micro Developments Copyright © by Carla Ghanem; Deborah Baker; Rob Morales; and Stephanie Williams is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.