CIDR: Classroom Interaction Detection and Recognition System. CIDR processes first-person video recordings from a preschool classroom to detect interactions with teachers and peers and to detect child-directed speech. The method is detailed in the paper "Automatized analysis of children's exposure to child-directed speech in preschool settings: Validation and application," https://doi.org/10.1371/journal.pone.0242511
PROTOCOL:
1) Extract the audio from each video as mono 16-bit PCM at 32 kHz using Audacity (with FFmpeg installed) or MATLAB's audioread function. Illustrative sketches for this step and for the AWS steps below are included after the list.
2) Manually extract video frames containing faces to build the face collection. We employed VFC. Every person who interacts with the focal child must appear in at least one of the selected frames. Follow the AWS recommendations at: https://docs.aws.amazon.com/rekognition/latest/dg/recommendations-facial-input-images.html
3) Upload the face-collection images, the videos, and the audio files to AWS S3 buckets.
4) Create the face collection using "AWSCollecCreate.py"
5) Add each image to the face collection created in step 4) using "AWSCollecAdd.py"
6) Explore the faces detected in step 5) and assign the project's ID to each face using "FaceCollectionIDCheck.m". Delete low-quality faces, first with "AWSCollecDeleteFaces.py" and later with the commented section at the end of "FaceCollectionIDCheck.m"
- Use "AWSCollecListFaces.py" to generate final list and save as a json file
8) Append the project's IDs used in the study to the list of faces from step 7) using "TrueIDAppend.m"
- Run "AWSGetFaceDetec.py", "AWSGetFaceSearch.py" and save json output in their corresponding folders
10) Check the identifications from FaceSearch using "IdentificationCheck.m"
11) Process the output from FaceDetect and FaceSearch using "VideoFeatureExtraction.m"
12) Obtain the audio transcription using "AWSTranscribe.py"
13) Process the transcripts and audio using "AudioFeatureExtraction.m"
14) Train the classifier using "TrainClassifier.m"
15) Deploy the classifier using "DeployClassifier.m"
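EXAMPLE SKETCHES:
The snippets below are illustrative sketches only, not the project's scripts; bucket names, collection names, file names, job names, and IDs are placeholders. They show the kind of FFmpeg and boto3 calls the corresponding protocol steps rely on.

Step 1: a minimal way to export mono 16-bit PCM audio at 32 kHz with FFmpeg, called from Python (assumes ffmpeg is installed and on the PATH; this is equivalent to the Audacity export described above).

```python
# Sketch: extract the audio track as mono 16-bit PCM at 32 kHz.
# Assumes ffmpeg is on the PATH; file names are placeholders.
import subprocess

def extract_audio(video_path, wav_path):
    subprocess.run(
        ["ffmpeg", "-y", "-i", video_path,
         "-vn",                   # drop the video stream
         "-acodec", "pcm_s16le",  # 16-bit PCM
         "-ac", "1",              # mono
         "-ar", "32000",          # 32 kHz sample rate
         wav_path],
        check=True,
    )

extract_audio("focal_child_video.mp4", "focal_child_audio.wav")
```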
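Step 3: uploading the inputs to S3 with boto3 (bucket and key names are placeholders, not the project's actual buckets).

```python
# Sketch: upload face images, videos, and audio files to S3.
import boto3

s3 = boto3.client("s3")
s3.upload_file("faces/teacher_01.jpg", "cidr-face-images", "teacher_01.jpg")
s3.upload_file("focal_child_video.mp4", "cidr-videos", "focal_child_video.mp4")
s3.upload_file("focal_child_audio.wav", "cidr-audio", "focal_child_audio.wav")
```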
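Steps 4-5: the kind of Rekognition calls that "AWSCollecCreate.py" and "AWSCollecAdd.py" presumably wrap (collection, bucket, and image names are placeholders).

```python
# Sketch: create a face collection and index the uploaded images into it.
import boto3

rekognition = boto3.client("rekognition")

# Step 4: create the collection
rekognition.create_collection(CollectionId="cidr-classroom-faces")

# Step 5: index each selected frame from the S3 bucket
images = ["teacher_01.jpg", "peer_03.jpg"]  # placeholder: one entry per selected frame
for name in images:
    response = rekognition.index_faces(
        CollectionId="cidr-classroom-faces",
        Image={"S3Object": {"Bucket": "cidr-face-images", "Name": name}},
        ExternalImageId=name.replace(".jpg", ""),  # tag faces with the source image
        DetectionAttributes=["DEFAULT"],
    )
    for record in response["FaceRecords"]:
        print(name, record["Face"]["FaceId"])
```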
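Steps 6-7: listing the collection to a JSON file and deleting low-quality faces by FaceId (the FaceId shown is a placeholder; in the protocol the IDs to delete come from "FaceCollectionIDCheck.m").

```python
# Sketch: list all faces in the collection and prune low-quality ones.
import json
import boto3

rekognition = boto3.client("rekognition")

# Step 7: page through the collection and save the face list as JSON
faces, kwargs = [], {"CollectionId": "cidr-classroom-faces"}
while True:
    page = rekognition.list_faces(**kwargs)
    faces.extend(page["Faces"])
    if "NextToken" not in page:
        break
    kwargs["NextToken"] = page["NextToken"]
with open("FaceList.json", "w") as f:
    json.dump(faces, f, indent=2)

# Step 6: delete low-quality faces by their FaceId
rekognition.delete_faces(
    CollectionId="cidr-classroom-faces",
    FaceIds=["0040279c-0178-436e-b8fa-000000000000"],  # placeholder FaceId
)
```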
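Step 9: the asynchronous Rekognition video jobs whose results "AWSGetFaceDetec.py" and "AWSGetFaceSearch.py" retrieve (bucket, video, and collection names are placeholders; polling and error handling are simplified).

```python
# Sketch: run face detection and face search on a stored video and save the JSON.
import json
import time
import boto3

rekognition = boto3.client("rekognition")
video = {"S3Object": {"Bucket": "cidr-videos", "Name": "focal_child_video.mp4"}}

# Start both asynchronous video-analysis jobs
detect_job = rekognition.start_face_detection(Video=video)["JobId"]
search_job = rekognition.start_face_search(
    Video=video, CollectionId="cidr-classroom-faces"
)["JobId"]

def collect_results(getter, job_id, key):
    """Poll a Get* call until the job leaves IN_PROGRESS, then page through results."""
    while getter(JobId=job_id)["JobStatus"] == "IN_PROGRESS":
        time.sleep(30)
    items, token = [], None
    while True:
        kwargs = {"JobId": job_id}
        if token:
            kwargs["NextToken"] = token
        page = getter(**kwargs)
        items.extend(page.get(key, []))
        token = page.get("NextToken")
        if not token:
            return items

with open("FaceDetection.json", "w") as f:
    json.dump(collect_results(rekognition.get_face_detection, detect_job, "Faces"), f)
with open("FaceSearch.json", "w") as f:
    json.dump(collect_results(rekognition.get_face_search, search_job, "Persons"), f)
```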
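Step 12: a Transcribe job of the kind "AWSTranscribe.py" presumably starts (job name, S3 URI, and language code are placeholders).

```python
# Sketch: transcribe the extracted audio with Amazon Transcribe.
import time
import boto3

transcribe = boto3.client("transcribe")

transcribe.start_transcription_job(
    TranscriptionJobName="cidr-focal-child-audio",
    Media={"MediaFileUri": "s3://cidr-audio/focal_child_audio.wav"},
    MediaFormat="wav",
    LanguageCode="en-US",
)

# Poll until the job finishes, then print the URI of the transcript JSON
while True:
    job = transcribe.get_transcription_job(TranscriptionJobName="cidr-focal-child-audio")
    status = job["TranscriptionJob"]["TranscriptionJobStatus"]
    if status in ("COMPLETED", "FAILED"):
        break
    time.sleep(30)

if status == "COMPLETED":
    print(job["TranscriptionJob"]["Transcript"]["TranscriptFileUri"])
```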