WEBVTT 00:00.000 --> 00:12.000 So we have another new topic today within biology that we're covering. 00:12.000 --> 00:20.000 So we're going from pictures and recognising the difference between huskies and wolves to behavioural ecology. 00:20.000 --> 00:26.000 And so please put your hands together for, yep, shall I go? 00:26.000 --> 00:32.000 Yep, for Niko Sirmpilatze, who will be talking about movement, which, as you'll hear, is also a call for community. 00:32.000 --> 00:34.000 So do have a look and get involved. 00:34.000 --> 00:36.000 Thank you. 00:36.000 --> 00:50.000 Can you hear me okay? 00:50.000 --> 00:51.000 Yep. 00:51.000 --> 00:52.000 Okay. 00:52.000 --> 00:53.000 Thanks for the introduction. 00:53.000 --> 01:00.000 I'm very excited to be here at the first, as I said, the first bioinformatics and computational biology dev room. 01:00.000 --> 01:06.000 And it's a bit of a leap, because we're going from genomes and proteins, which were the subject of most of the talks, to animals. 01:06.000 --> 01:12.000 So my talk is going to be a bit different, but I hope it will be enjoyable for most of you. 01:12.000 --> 01:19.000 So I will talk about a Python toolbox we are developing, which is called movement, and which is for analyzing animal motion tracking data. 01:19.000 --> 01:21.000 So first of all, a bit of information about myself. 01:21.000 --> 01:26.000 I'm a neuroscientist and research software engineer, based at the Sainsbury Wellcome Centre at UCL. 01:26.000 --> 01:31.000 That's the building you saw on the title slide. I work in a team of research software engineers. 01:31.000 --> 01:35.000 We call ourselves the Neuroinformatics Unit, so just swap the bio for neuro. 01:35.000 --> 01:40.000 I'm also a fellow of the Software Sustainability Institute, which was mentioned in an earlier talk. 01:40.000 --> 01:45.000 And I'm the lead developer of movement, which is the most relevant affiliation, I guess, for today's talk. 01:45.000 --> 01:51.000 So I'm going to give you a lot of background, because I expect that many of you are new to this field, 01:51.000 --> 01:58.000 before I jump into explaining the package. So the problem we are trying to solve is essentially to quantify behavior. 01:58.000 --> 02:02.000 And this is tricky because behavior is a very nebulous concept to define. 02:02.000 --> 02:10.000 But some definitions are useful, and I like this one from 1951, which defines it as the total movements made by the intact animal. 02:10.000 --> 02:14.000 And I like the word movement, because movement is something we can measure precisely. 02:14.000 --> 02:16.000 And how can we measure movement? 02:17.000 --> 02:21.000 First of all, we can use video cameras to film the movement itself. 02:21.000 --> 02:24.000 That will be the topic of the talk today. 02:24.000 --> 02:30.000 We can also use inertial measurement units, which are sensors you can attach to the body; they have accelerometers and things like that. 02:30.000 --> 02:37.000 You can also use GPS-based bio-logging for tracking movement on large time scales and large spatial scales. 02:37.000 --> 02:39.000 But today we are only talking about video cameras. 02:40.000 --> 02:50.000 The problem with video cameras is that you somehow need to transform the image, which is the video frame, into motion, or rather the series of frames into motion. 02:50.000 --> 02:54.000 The traditional way of doing that, which maybe you are familiar with, is motion capture. 02:54.000 --> 02:58.000 So in the entertainment industry, in cinema and animation, you would wear suits.
02:58.000 --> 03:00.000 They put reflective markers on you. 03:00.000 --> 03:08.000 You go into a specially designed room, which has cameras, I think 20 or 30 cameras all around, that film you from every possible angle. 03:08.000 --> 03:14.000 These cameras locate the reflective markers, you do a 3D reconstruction, and you get a 3D time series of motion. 03:14.000 --> 03:18.000 Amazing. High quality. The problem is it's incredibly expensive. 03:18.000 --> 03:23.000 And it's a bit invasive, because you have to wear the suit and the markers. 03:23.000 --> 03:27.000 If you're trying to do it in animals, that's not so easy and very, very costly. 03:27.000 --> 03:31.000 So animal researchers have been resorting to lower-cost ways of dealing with the problem. 03:31.000 --> 03:33.000 You can see one of them in this paper. 03:33.000 --> 03:38.000 You can attach color markers to an animal, or to different body parts of an animal, like an ant in this case. 03:38.000 --> 03:44.000 And then you can use traditional image filtering methods to identify these colors during your analysis. 03:44.000 --> 03:50.000 But this has changed, because in the last 10 years or so we have had computer vision, deep-learning-based computer vision. 03:50.000 --> 03:57.000 And the new name of the game is markerless motion tracking. 03:57.000 --> 04:02.000 The way this works is that, like everything else nowadays, you train a neural network 04:02.000 --> 04:06.000 to identify body parts. So basically you label some video frames by hand. 04:06.000 --> 04:11.000 You click on the nose, on the shoulder, or whatever you care about. You label hundreds of these frames, usually. 04:11.000 --> 04:16.000 And you teach a neural network to recognize the same body parts in unseen frames. 04:16.000 --> 04:25.000 If you do this for the entire video and you train the model well, you can give it a video and it gives you this representation of the body parts overlaid on top of the video. 04:26.000 --> 04:29.000 There are many user-friendly tools that implement that workflow. 04:29.000 --> 04:33.000 Probably the best-known one is DeepLabCut. 04:33.000 --> 04:38.000 But there are others, like SLEAP and LightningPose, and probably 10 others that do a very similar job. 04:38.000 --> 04:45.000 So this is actually quite cool, because it has transformed the study of animal behavior by changing the scale. 04:45.000 --> 04:50.000 Before, you had to click through every frame; now you can do it in a few days. 04:51.000 --> 04:56.000 That's why, essentially, animal behavior scientists are now falling in love with computer vision. 04:56.000 --> 05:01.000 And that has brought together many fields that used to be quite separate. 05:01.000 --> 05:10.000 Because now neuroscientists, biomechanics people, zoologists, conservationists, psychologists, vets, animal welfare people, livestock farming people: 05:10.000 --> 05:17.000 they're all interested in using these tools, or they're already using them, which means that there is a convergence in the toolbox. 05:17.000 --> 05:22.000 So basically all of these fields are now starting to use markerless tracking. 05:22.000 --> 05:26.000 The problem is: what happens afterwards? 05:26.000 --> 05:34.000 Because this advent of computer vision has kind of moved the bottleneck one step down the line. 05:34.000 --> 05:40.000 Because now we can produce these motion tracks very fast, but we are not actually interested in the motion tracks. 05:40.000 --> 05:42.000 We're actually interested in the behavior of the animal.
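For readers following the transcript: below is a rough sketch of that label-train-predict workflow using DeepLabCut, one of the tools just mentioned. The project name, experimenter name, and video paths are placeholders, and exact arguments vary between DeepLabCut versions, so treat it as an outline rather than a copy-paste recipe.

import deeplabcut

# Create a project around one or more videos; this returns the path to config.yaml.
config = deeplabcut.create_new_project(
    "mouse-arena", "your-name", ["videos/session1.mp4"], copy_videos=True
)

deeplabcut.extract_frames(config)           # pick frames to annotate
deeplabcut.label_frames(config)             # hand-label body parts in a GUI
deeplabcut.create_training_dataset(config)  # package the labels for training
deeplabcut.train_network(config)            # train the pose-estimation network
deeplabcut.evaluate_network(config)         # check performance on held-out labels

# Run the trained model on new videos; predictions are saved alongside the videos.
deeplabcut.analyze_videos(config, ["videos/session2.mp4"])
deeplabcut.create_labeled_video(config, ["videos/session2.mp4"])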
05:42.000 --> 05:49.000 So we want to essentially extract meaning, or measure some meaningful metrics of behavior, based on tracking data. 05:49.000 --> 05:54.000 Some simple examples would be: maybe you want to measure the speed of the animal or the orientation of the animal in space, 05:54.000 --> 06:00.000 which areas of the environment it has visited, how far it has traveled, things like that, or more complex things. 06:00.000 --> 06:03.000 Currently this is done with ad hoc scripts in every lab. 06:03.000 --> 06:08.000 So every PhD student or postdoc writes a script in Python, MATLAB or R to compute speeds. 06:08.000 --> 06:14.000 There are probably 200 versions of Python functions for computing speed in my institute alone. 06:14.000 --> 06:17.000 And that's not really great for the field. 06:17.000 --> 06:22.000 So there is a lack of standardized data formats and tools, because all the computer vision toolboxes 06:22.000 --> 06:25.000 I mentioned spit out data in different formats. 06:25.000 --> 06:30.000 We have lots of fragile in-house scripts that are not maintained beyond the project's conclusion. 06:30.000 --> 06:36.000 And we have piles of unanalyzed data, because we're producing motion tracks faster than we can analyze them. 06:36.000 --> 06:42.000 So this is precisely the problem that we set out to solve, and we are solving it with movement. 06:42.000 --> 06:45.000 It's a Python toolbox, it's a Python package. 06:45.000 --> 06:48.000 You can find it on PyPI and on conda-forge. 06:48.000 --> 06:51.000 So that's kind of the visual summary of it. 06:51.000 --> 06:52.000 What does it do? 06:52.000 --> 06:53.000 It loads data. 06:53.000 --> 06:58.000 It ingests data from a variety of animal tracking frameworks, including the most popular ones. 06:58.000 --> 07:01.000 And we are continuously adding support for more formats. 07:01.000 --> 07:04.000 Once the data is in Python, we have a unified representation of the data 07:04.000 --> 07:08.000 as a multi-dimensional array, using the Python library xarray. 07:08.000 --> 07:12.000 And this means that it doesn't matter which animal species you are studying, how many animals you are studying, 07:12.000 --> 07:14.000 or how you track them. 07:14.000 --> 07:17.000 Once the data is in movement, it looks the same. 07:17.000 --> 07:23.000 This allows us then to implement a bunch of general-purpose methods for filtering, visualizing, and analyzing the data. 07:23.000 --> 07:29.000 And we do the implementations in a way that's fully tested, validated, and documented. 07:29.000 --> 07:33.000 So what are the sorts of things you can currently do with our API? 07:33.000 --> 07:36.000 You can also check out the examples on our website. 07:36.000 --> 07:41.000 You can do data cleaning, which is very important because, as we heard in the previous talk, models make mistakes all the time. 07:41.000 --> 07:46.000 And you want to use some characteristics of the data to identify these mistakes and correct them post hoc. 07:46.000 --> 07:51.000 You can visualize the data: we have some plotting utilities for plotting the trajectory of the animal, 07:51.000 --> 07:55.000 or which areas of its environment it has visited, as a heat map. 07:56.000 --> 08:00.000 You can precisely quantify how much time the animal spent in each area of the environment. 08:00.000 --> 08:05.000 You can measure the orientation angle of any body part you are interested in, let's say, here, the head. 08:05.000 --> 08:07.000 You can also use it in specialized applications.
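As an illustration of the loading, cleaning, and analysis steps described above, here is a minimal sketch using movement together with plain xarray operations. The file name, frame rate, confidence threshold, and keypoint name are made-up examples, and the movement API may differ slightly between versions; movement also ships dedicated filtering and kinematics helpers that wrap steps like these.

import numpy as np
from movement.io import load_poses

# Load predicted poses from a SLEAP analysis file into an xarray Dataset.
ds = load_poses.from_sleap_file("mouse_session.analysis.h5", fps=30)

# The dataset holds a "position" array (dimensions: time, individuals,
# keypoints, space) and a "confidence" array with the model's scores.
print(ds)

# Basic cleaning with plain xarray: drop low-confidence predictions,
# then interpolate the resulting gaps over time.
position = ds.position.where(ds.confidence >= 0.9)
position = position.interpolate_na(dim="time", method="linear")

# Speed: differentiate position with respect to time and take the
# Euclidean norm over the spatial dimension.
velocity = position.differentiate("time")
speed = np.sqrt((velocity**2).sum(dim="space"))
print(speed.sel(keypoints="snout"))  # "snout" is a hypothetical keypoint name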
08:07.000 --> 08:12.000 You can do pupillometry: you can track points on the pupil and then measure, let's say, how the pupil dilates, 08:12.000 --> 08:16.000 or how the eyes move from left to right, and many, many more. 08:16.000 --> 08:20.000 We are also interested in kind of lowering the barrier of entry for users, 08:20.000 --> 08:26.000 so we are developing a graphical user interface as a plugin for the Python image viewer napari. 08:26.000 --> 08:32.000 What you see here is: I've loaded a video of two mice in an octagonal arena inside napari. 08:32.000 --> 08:36.000 I can navigate and scroll through the video like you would do in a normal video player. 08:36.000 --> 08:41.000 But now, importantly, you can use movement to essentially overlay the tracking data on top of the video. 08:41.000 --> 08:45.000 And you can do this from a variety of different software sources. 08:45.000 --> 08:50.000 You can select the file path, and then the data appears in napari as two layers. 08:50.000 --> 08:54.000 First, you get the points, which are the predicted keypoints, or body parts. 08:54.000 --> 08:57.000 And you can display their labels to inspect them. 08:57.000 --> 09:02.000 But apart from points, we also have this other layer on top, which is called tracks. 09:02.000 --> 09:05.000 And you will see what that is in a moment. 09:05.000 --> 09:11.000 As I play the video, you see essentially this tail extending behind the animal, which is the position of the animal in the previous frames. 09:11.000 --> 09:14.000 And you can very easily play with the parameters of that. 09:14.000 --> 09:18.000 So you can make the tail longer or shorter, to view more or less of that history. 09:18.000 --> 09:21.000 That's quite useful for identifying tracking errors. 09:21.000 --> 09:25.000 And you can also go the other way, so you can extend the head of the track and see the future. 09:25.000 --> 09:30.000 So if you do it all the way, you see the entire trajectory of the animal in this experiment. 09:30.000 --> 09:32.000 How are we doing all of this? 09:32.000 --> 09:38.000 So we are following a completely community-powered, community-driven development model. 09:38.000 --> 09:42.000 We have a core team of engineers; I'm one of them. 09:42.000 --> 09:47.000 But by now we have 34 contributors on GitHub, from multiple research labs around the world, 09:47.000 --> 09:55.000 we have over 300 merged pull requests so far, and 80,000 downloads on PyPI and conda-forge. 09:55.000 --> 10:00.000 And we already have several external packages depending on movement, using it as a dependency essentially. 10:00.000 --> 10:06.000 So the bigger picture here is that we want movement to become a core part of the workflow for analyzing animal behavior, 10:06.000 --> 10:11.000 one that takes you from motion tracking data to meaningful metrics of animal behavior. 10:11.000 --> 10:16.000 And we see movement as a key part of the scientific Python ecosystem; 10:16.000 --> 10:20.000 I like to say it's like scikit-image for animal motion data, essentially. 10:20.000 --> 10:24.000 But maybe you don't like Python and you prefer R, and I know there are many of you in this room. 10:24.000 --> 10:29.000 There is a sister project in R developed by a collaborator of ours, Mikkel. 10:29.000 --> 10:31.000 He calls it animovement. 10:31.000 --> 10:36.000 It's essentially the same workflow, and we work together to standardize on data formats and naming. 10:36.000 --> 10:42.000 So we kind of keep the development in step, and you get a similar experience in Python and in R.
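To make the points and tracks layers a bit more concrete: the snippet below is not the movement napari plugin itself, just a small illustration of the underlying napari layer types it builds on, using random stand-in data (all names and values are made up). The tail_length and head_length parameters are what control how much of a track's past and future is drawn.

import napari
import numpy as np

# Fake 2-D trajectory of a single keypoint over 100 frames.
n_frames = 100
rng = np.random.default_rng(seed=0)
yx = np.cumsum(rng.normal(size=(n_frames, 2)), axis=0) + 100.0
frames = np.arange(n_frames)

# napari Points data: one row per frame, columns (frame, y, x).
points = np.column_stack([frames, yx])
# napari Tracks data: columns (track_id, frame, y, x).
tracks = np.column_stack([np.zeros(n_frames), frames, yx])

viewer = napari.Viewer()
viewer.add_points(points, size=5, name="keypoints")
# tail_length shows the recent history behind the point; head_length shows the future.
viewer.add_tracks(tracks, tail_length=30, head_length=0, name="tracks")
napari.run()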
10:42.000 --> 10:44.000 That's all I wanted to say today. 10:44.000 --> 10:48.000 I would like to thank the other two core developers, Chang Huan Lo and Sofía Miñano, who are not here today, 10:48.000 --> 10:50.000 and our supervisor, Adam Tyson, 10:50.000 --> 10:57.000 our host institutions and collaborators, and of course the many users and contributors who have made movement what it is. 10:58.000 --> 11:03.000 And I will also mention: if you're interested in the project, check out the links. 11:03.000 --> 11:15.000 And also, we are running a workshop in London next summer for people who want to learn how to analyze videos of animals with techniques like the ones I mentioned today, including with movement. 11:15.000 --> 11:20.000 The registration is open for another two weeks, so you have time to register for that. 11:20.000 --> 11:22.000 Thanks a lot for your attention. 11:23.000 --> 11:28.000 Any questions? 11:28.000 --> 11:34.000 I also have some questions that I've already set up here. 11:34.000 --> 11:35.000 No questions? 11:35.000 --> 11:36.000 I have one. 11:36.000 --> 11:37.000 I have one. 11:37.000 --> 11:38.000 I have one. 11:38.000 --> 11:40.000 Your package is about movement, 11:40.000 --> 11:41.000 but what you eventually want to get at is behavior. 11:41.000 --> 11:43.000 Have you managed to bridge that gap? 11:43.000 --> 11:45.000 Or is it planned? 11:45.000 --> 11:48.000 Yes, I will. 11:48.000 --> 11:54.000 So the question was that the package is about movement, but what you eventually want to get at is behavior. 11:54.000 --> 11:57.000 And there is still a long way from, let's say, speed to actual behavior. 11:57.000 --> 12:05.000 What I didn't mention is the thing you see here at the bottom, which is so-called behavior segmentation, which means you take the continuous motion tracks 12:05.000 --> 12:09.000 and you segment them into identifiable behaviors that a human would describe. 12:09.000 --> 12:17.000 This is not in scope for this project, but we may be working on a different project that covers that scope, because we want to keep this package quite lightweight and single-purpose. 12:17.000 --> 12:32.000 Thank you. 12:32.000 --> 12:37.000 So we actually have a few more minutes still. 12:37.000 --> 12:44.000 So I might actually ask a very quick question, Niko, 12:45.000 --> 12:47.000 as he is trying to run off. 12:47.000 --> 12:56.000 Do you share low-level data formats between R and Python using xarray, or do you just have separate xarray implementations? 12:56.000 --> 13:02.000 Yeah, we were actually discussing this with Mikkel, who is the guy who developed the R package, a few weeks ago. 13:02.000 --> 13:03.000 There are solutions. 13:03.000 --> 13:11.000 So one of them is xarray's native format, netCDF, which is actually just HDF5 with a specification, 13:11.000 --> 13:18.000 and you can load HDF5 in R. The other solution we are talking about is transforming the data from xarray into a tabular format and saving it as, say, Parquet. 13:18.000 --> 13:21.000 So that would also be an option, and we are working on that. 13:21.000 --> 13:30.000 So in the end we don't care about having one single format; as long as we agree on the spec, we can save any kind of file we want. 13:30.000 --> 13:36.000 Okay, so let's thank Niko once again. 13:41.000 --> 13:43.000 Thank you.
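To spell out the two interchange routes mentioned in that last answer, here is a small sketch of the Python side, using plain xarray and pandas calls. The file names are placeholders, and saving to Parquet assumes pyarrow or fastparquet is installed.

import xarray as xr
from movement.io import load_poses

ds = load_poses.from_dlc_file("session_predictions.h5", fps=25)

# Route 1: netCDF, xarray's native on-disk format (HDF5 underneath),
# which R can read through its netCDF / HDF5 packages.
ds.to_netcdf("poses.nc")

# Route 2: flatten the dataset into a tidy table and save it as Parquet.
df = ds.to_dataframe().reset_index()
df.to_parquet("poses.parquet")

# Round-trip check on the Python side.
print(xr.open_dataset("poses.nc"))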