Computer Vision Workshop Madrid

Madrid Computer Vision Workshop

This 2-day intermediate workshop will cover how to apply computer vision technologies to investigatory research. It will provide an introduction, theoretical background, a survey of trends and capabilities, and an exploration of training datasets, and will show how to build a computer vision application. There are no prerequisites, but some familiarity with coding is recommended; participants can also partner with each other for the technical sections. The workshop is based on research and code developed for the Exposing.ai and VFRAME.io computer vision projects, but it covers a wide range of computer vision technologies and topics. Python code examples will be provided in Jupyter notebooks that can be run locally.

Instructor: Adam Harvey / https://adam.harvey.studio

Custom AO-2.5RT cluster munition object detection algorithm developed for VFRAME

Instructor Bio
Adam Harvey is a computer vision researcher, software developer, and the founder of the VFRAME.io project. He developed what could be considered the first well-known computer vision hack in 2010 by reverse engineering face detection algorithms to create computer vision camouflage. In 2017 Harvey started VFRAME to help bridge the gap between commercial computer vision technologies and their application to human rights research. He also runs the Exposing.ai research project, which investigates the origins and endpoints of training data used in face recognition systems. His computer vision research has appeared in the New York Times, Financial Times, and Wall Street Journal, and received an Award of Distinction from Ars Electronica.

Day 1

Session 1A: Introduction and Theory (11:00 - 13:00)


(Lunch break 13:00 - 14:00)


Session 1B: Computer Vision Now (14:00 - 16:00)

  • Commercial and open source computer vision:
    • Amazon Rekognition: image and video analysis API
    • Azure: paid computer vision APIs
    • Azure face recognition demo: face recognition demo and service
    • Google OCR: document OCR demo and paid service
    • PimEyes: face recognition demo and paid service, popular with journalists
    • FindClone: face recognition service (RU), requires registration, recommended by Bellingcat
    • TinEye: reverse image lookup, useful to find origin of an image
    • InVID: reverse image lookup (via Google, Bing, etc…)
    • For discussion:
      • What are other commercial computer vision demos or services?
      • What are advantages/disadvantages of commercial API services?
      • Are the capabilities useful for any project you have in mind?
      • What would you be willing to pay for these services?
      • What happens when you’re working on sensitive material?
      • Are there any open source versions of these products?
  • Open-source computer vision projects and libraries:
    • OpenCV: the most widely used computer vision backend library for roughly 20 years or more, though it is struggling to keep up with DNN libraries and GPU support (NB: OpenCV uses BGR channel order, not RGB)
    • Pillow: an easy to use and efficient library for working with images: changing contrast, resizing, display
    • NumPy: array and matrix library. Since images are matrices, NumPy is widely used for image processing and is highly performant
    • Pandas: popular data management and analysis library, like Excel but for Python. It is commonly used with CSV data and can also write Excel documents
    • ONNX: Open Neural Network Exchange, a cross-platform model format that can be used in many different frameworks. Not as efficient as some platform-specific formats (eg .pt, .pb, .coreml) but more portable
    • YOLOv5: popular object detection library. There are many versions of YOLO; this one is updated often and has an active community of contributors (used by VFRAME)
  • Machine Learning Frameworks:
  • Models (aka ModelZoos):
    • ModelZoo.ca: lists popular models from GitHub
    • Modzy: MLOps platform that overlaps with defense contracting
    • RunwayML: GUI tool for running many ML models
    • ONNX: common models for ONNX Runtime
    • HuggingFace: place to run CV/ML demos
  • CV tutorials:
    • PyImageSearch: informative tutorials on a wide range of CV topics. Most are free, with paid access for more tutorials (recommended)
    • LearnOpenCV: informative tutorials on a wide range of CV topics, also offers paid courses (recommended)
  • Algorithms:
  • Coffee break and discussions
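
The BGR note on OpenCV in the list above is a common source of swapped colors when moving images between OpenCV and libraries such as Pillow, which expect RGB. The following is a minimal sketch using only NumPy; the tiny two-pixel array and the helper name `bgr_to_rgb` are illustrative assumptions, not part of the workshop code. With OpenCV installed, `cv2.cvtColor(image, cv2.COLOR_BGR2RGB)` should give the same result.

```python
import numpy as np

# A 1x2 "image" in OpenCV's BGR channel order:
# the first pixel is pure blue, the second is pure red.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)


def bgr_to_rgb(image):
    """Reverse the last (channel) axis, turning BGR into RGB.

    The same operation also turns RGB back into BGR.
    copy() makes the result contiguous in memory, which some
    libraries require.
    """
    return image[..., ::-1].copy()


rgb = bgr_to_rgb(bgr)
print(rgb.tolist())  # [[[0, 0, 255], [255, 0, 0]]]
```

If colors look wrong after loading an image with OpenCV and displaying it with another library, a missing channel swap like this is usually the first thing to check.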

Session 1C: Datasets (16:30 - 18:00)

Session 1D: Annotations (19:00 - 20:00)

  • Installing Conda locally:
  • Tools:
    • LabelImg: popular tool for image labeling on local machine (recommended)
    • VGG VIA: browser-based image annotation tool that is easy to start with
    • CVAT: Intel’s open source annotation tool, too complex for simple projects, but popular for larger projects
    • RoboFlow: one of many paid image annotation tools
  • Assignment: what kind of dataset would you want to create?
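
Annotation tools such as LabelImg can save boxes as Pascal VOC XML (pixel corner coordinates) or as YOLO text files (normalized center, width, and height). The sketch below shows one way the conversion between the two could look, using only the standard library. The sample XML, the class name `munition`, and the function names are hypothetical and chosen for illustration; this is not the workshop's own conversion code.

```python
import xml.etree.ElementTree as ET

# Hypothetical Pascal VOC annotation for a 200x100 pixel image.
SAMPLE_XML = """
<annotation>
  <size><width>200</width><height>100</height><depth>3</depth></size>
  <object>
    <name>munition</name>
    <bndbox><xmin>50</xmin><ymin>20</ymin><xmax>150</xmax><ymax>60</ymax></bndbox>
  </object>
</annotation>
"""


def voc_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert pixel corner coordinates to normalized YOLO values."""
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


def voc_xml_to_yolo_lines(xml_text, class_names):
    """Return one 'class x_center y_center width height' line per object.

    Raises ValueError if an object's name is not in class_names.
    """
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = int(size.findtext("width"))
    img_h = int(size.findtext("height"))
    lines = []
    for obj in root.iter("object"):
        class_id = class_names.index(obj.findtext("name"))
        box = obj.find("bndbox")
        corners = [float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
        xc, yc, w, h = voc_box_to_yolo(*corners, img_w, img_h)
        lines.append(f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
    return lines


yolo_lines = voc_xml_to_yolo_lines(SAMPLE_XML, ["munition"])
print(yolo_lines)
```

Each returned line would go into a .txt file with the same base name as its image.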

Day 2

  • Session 2A: How to build an Object Detection Algorithm Part 1 (11:00 - 13:00)
    • Continue working on annotations
    • Rename annotation files to unique names to avoid collisions
    • Upload/share files
    • Set up cv-workshop: https://github.com/adamhrv/cv-workshop/
    • Try running notebooks
    • Try running YOLO
  • Session 2B: How to build an Object Detection Algorithm Part 2 (14:00 - 16:00)
    • Upload your YOLO files here
      • create a subfolder with your name (eg adam)
      • then upload your folders with .jpg and .txt files
      • should look like adam/1234abcd/1234abcd_00001.jpg, adam/1234abcd/1234abcd_00001.txt etc…
    • Cloud GPU services
    • Verifying data
  • Session 2C: Face Recognition Demo (16:30 - 18:00)
    • DeepFace, InsightFace, notebooks
  • Session 2D: VFRAME demo (19:00 - 20:00)
    • Discussions
    • Try to get VFRAME running
    • Test our model
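
The unique naming and data verification steps from Sessions 2A and 2B can be sketched with a few standard-library helpers. This is a minimal illustration under stated assumptions: the file name pattern follows the `1234abcd_00001.jpg` example above, the label check follows the YOLOv5 text format (one `class x_center y_center width height` line per box, coordinates normalized to 0-1), and all function names are hypothetical rather than taken from the cv-workshop repository.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def unique_name(prefix: str, index: int, suffix: str) -> str:
    """Build a file name such as 1234abcd_00001.jpg from a per-folder prefix."""
    return f"{prefix}_{index:05d}{suffix}"


def label_line_errors(line: str, num_classes: int) -> list[str]:
    """Return the problems found in one YOLO label line (empty list if valid)."""
    parts = line.split()
    if len(parts) != 5:
        return [f"expected 5 fields, got {len(parts)}"]
    try:
        class_id = int(parts[0])
        coords = [float(p) for p in parts[1:]]
    except ValueError:
        return ["fields must be an integer class id followed by four numbers"]
    errors = []
    if not 0 <= class_id < num_classes:
        errors.append(f"class id {class_id} out of range")
    if any(not 0.0 <= c <= 1.0 for c in coords):
        errors.append("coordinates must be normalized to [0, 1]")
    return errors


def unpaired_files(names: list[str]) -> list[str]:
    """Return .jpg/.txt names whose counterpart with the same stem is missing."""
    jpg_stems = {PurePosixPath(n).stem for n in names if n.endswith(".jpg")}
    txt_stems = {PurePosixPath(n).stem for n in names if n.endswith(".txt")}
    missing = jpg_stems ^ txt_stems  # stems present in only one of the two sets
    return sorted(n for n in names if PurePosixPath(n).stem in missing)


print(unique_name("1234abcd", 1, ".jpg"))
print(label_line_errors("0 0.5 0.5 1.2 0.3", num_classes=1))
print(unpaired_files(["a_00001.jpg", "a_00001.txt", "a_00002.jpg"]))
```

An image without a matching .txt file is not necessarily an error, since YOLOv5 treats images without labels as background, but it is worth reviewing before upload.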