City-Identification of Flickr videos using semantic acoustic features
Benjamin Elizalde - Carnegie Mellon University
Outline
1. Task
2. Approach
3. Experiments
4. Results
5. Conclusion
City-identification of videos
● Aims to determine the likelihood of a video belonging to a set of cities.
● Our approach focuses only on the audio track.
Approach to City-identification of videos
● Expresses the relationship between a taxonomy of urban sounds and the city-soundtracks.
● Computes and uses semantic acoustic features to show evidence of the relationship.
● Contrasts with using only frequency analysis of the city-soundtrack.
Our sounds and cities
● The 10 urban sounds:
○ air conditioner, car horn, children playing, dog bark, engine idling, gun-shot, jackhammer, siren, drilling, and street music.
● The 18 cities:
○ Bangkok, Barcelona, Beijing, Berlin, Chicago, Houston, London, Los Angeles, Moscow, New York, Paris, Prague, Rio, Rome, San Francisco, Seoul, Sydney, and Tokyo.
A combination of sounds to approximate the city-soundtrack
● The linear combination and the weight matrix can be used as the acoustic features.
● The weight matrix carries the semantic evidence, indicating the presence of a given sound in a city-soundtrack.
● Successful examples of sound retrieval were achieved using the weight matrix, e.g. retrieving sirens in a Berlin video.
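The idea of approximating a city-soundtrack as a weighted combination of urban-sound bases can be illustrated with a small sketch. The slides do not state which decomposition was used, so everything below is an assumption: the bases `B` and the soundtrack `V` are synthetic toy matrices, the weights are fitted with non-negative multiplicative updates (one common way to solve V ≈ B·W with B held fixed), and the helper names `estimate_weights` and `semantic_feature` are hypothetical.

```python
import numpy as np

# The 10 urban sounds listed in the slides; one basis column per sound.
SOUNDS = [
    "air conditioner", "car horn", "children playing", "dog bark",
    "engine idling", "gun-shot", "jackhammer", "siren", "drilling",
    "street music",
]

def estimate_weights(V, B, n_iter=500, eps=1e-9):
    """Fit non-negative weights W so that V ~= B @ W, keeping B fixed.

    V: (n_freq, n_frames) non-negative spectrogram of a city-soundtrack.
    B: (n_freq, n_sounds) non-negative bases, one column per urban sound.
    ASSUMPTION: multiplicative updates are an illustrative choice,
    not necessarily the method used in the talk.
    """
    rng = np.random.default_rng(0)
    W = rng.random((B.shape[1], V.shape[1])) + eps
    BtV = B.T @ V
    BtB = B.T @ B
    for _ in range(n_iter):
        # Multiplicative step: entries stay non-negative by construction.
        W *= BtV / (BtB @ W + eps)
    return W

def semantic_feature(W):
    """Average each sound's weight over time and normalise to sum to 1."""
    activation = W.mean(axis=1)
    return activation / (activation.sum() + 1e-12)

# Toy data (hypothetical): each sound mostly occupies its own 4 frequency bins.
rng = np.random.default_rng(1)
n_freq, n_frames = 40, 25
B = 0.05 * rng.random((n_freq, len(SOUNDS)))
for k in range(len(SOUNDS)):
    B[4 * k:4 * k + 4, k] += 1.0

# A synthetic "soundtrack" dominated by sirens, with some car horns.
W_true = np.zeros((len(SOUNDS), n_frames))
W_true[SOUNDS.index("siren")] = 1.0
W_true[SOUNDS.index("car horn")] = 0.2
V = B @ W_true

W = estimate_weights(V, B)
feature = semantic_feature(W)
top_sound = SOUNDS[int(np.argmax(feature))]
print(top_sound)
```

In this sketch each row of `W` tracks one named sound over time, which is what makes the feature semantic: a large row for "siren" can be read directly as evidence that sirens are present, and the per-sound averages in `feature` could be passed to a classifier.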
End-to-end pipeline for city-identification
Our approach outperforms the state-of-the-art
*Statistical Features are statistics derived from MFCCs, such as mean, variance, kurtosis, etc.
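As a rough illustration of the statistical-features baseline, the sketch below summarises a matrix of MFCC frames with per-coefficient mean, variance, and kurtosis. It is a sketch under assumptions: MFCC extraction is not shown, the input is random placeholder data, the function name `statistical_features` is hypothetical, and the slides do not say which other statistics ("etc.") were included.

```python
import numpy as np

def statistical_features(mfcc):
    """Summarise MFCC frames as one fixed-length vector.

    mfcc: array of shape (n_frames, n_coeffs).
    Returns [mean | variance | excess kurtosis], each of length n_coeffs.
    ASSUMPTION: only the three statistics named in the slides are used.
    """
    mfcc = np.asarray(mfcc, dtype=float)
    mean = mfcc.mean(axis=0)
    var = mfcc.var(axis=0)
    centered = mfcc - mean
    # Excess kurtosis: fourth central moment over squared variance, minus 3.
    kurt = (centered ** 4).mean(axis=0) / np.maximum(var, 1e-12) ** 2 - 3.0
    return np.concatenate([mean, var, kurt])

# Placeholder input standing in for real MFCCs (500 frames, 13 coefficients).
rng = np.random.default_rng(0)
mfcc = rng.normal(size=(500, 13))
features = statistical_features(mfcc)
print(features.shape)
```

Unlike the weight matrix, the entries of this vector describe the spectral shape only and cannot be traced back to a named sound.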
More bases help and extend the semantic evidence
Audio can help city-identification of videos
1. City soundscapes contain information that aids city identification and geolocation.
2. Our method not only aids city-identification but also provides semantic evidence of the sounds present in a city-soundtrack.
3. More bases/sounds could improve our results and extend our evidence.