Well, I have a question. I have some data where only one of the two eyes had been filmed. Is it possible to get the gaze data out of that?
I fear not. The Pupil Invisible gaze pipeline requires a binocular pair of eye images as input.
Ah, okay.
Hi @papr, No, I'm very clear that you're inheriting timestamps from the nearest eye camera frame in the gaze data. This is WHY it should not be possible to have 4ms or 8ms intersample temporal distance in the gaze data, and that's a separate issue from the eye camera sample rate actually being 250Hz, not 200Hz, and all the other things. I was able to record eye data from one eye - an artificial eye with no movement whatsoever. I also found in a thread here that if one eye is lost, the gaze data from the other eye is reported without averaging, and we do get some gaze data, even though I'm not sure what the spatial coordinates mean in this context. So - I'll post the results of the spatial precision measures later, once we've converted from pixels to degrees, but the temporal precision/sampling frequency is clearly still highly variable even when the 1euro adaptive algorithm is not adapting - as there is no velocity. So - that seems to rule out the filter as the main cause of the timestamp strangeness, at least. See attached.
There is an update regarding the measured ISIs. It comes down to the software using software timestamps (time of reception) and to how the USB data transmission scheduling works in detail under the hood. I am still working on improving my understanding of these details. Once I feel comfortable/confident explaining it, I will let you know. I will also try to estimate the error introduced by the USB transmission scheduling.
What I can tell you is that the camera is definitively running at 200 Hz and not at 250 Hz.
"seems to rule out the filter as the main cause of the timestamp strangeness"
The filter does not touch the timestamps. Therefore, it was out of the question for me that it could be related to it. The only effect it could have is a possible phase shift in the gaze signal, but that would not be visible in the timestamps.
Note that the standard deviation of 4.209ms is the measure of temporal precision. For spatial precision we'll use the SD of intersample distance and the sample-to-sample RMS. Since we use an artificial eye with no movement, this spatial imprecision should represent system noise, without any biological component - but this assumes some things about the black box right now; it would be great to confirm.
"I also found in a thread here that if one eye is lost, the gaze data from the other eye is reported without averaging"
Pupil Invisible does not report gaze data for the two eyes separately. What you are talking about is the Pupil Core gaze estimation. Please do not mix them up conceptually!
If the 1euro filter is adapting, this implies a higher processing load which would affect timestamping, according to your description of the pipeline? If the eye camera is sampling at 200Hz, how come there are no ISIs of 5ms in the data, which would match that rate? It's all multiples of 4ms. From my perspective, the precision measures are from this tracker - the software producing them is of less interest to the user - so in all cases I'm talking about the data quality it outputs rather than any comparison of pipeline stages/processing software. We were able to get gaze data from one eye recorded - the artificial eye we use is fairly elaborate and designed to trick ophthalmology imaging equipment - so it works on all trackers we've tried it on. So far, 21 different trackers all recognise it as an eye, as it can also produce a bright-pupil reflection with on-axis illumination, all 4 Purkinje reflections, etc. This is apart from the fact that IF the gaze timestamps are inherited from the nearest valid eye camera frames, then, since the gaze data / world camera frames are at 66Hz, no two gaze samples should have a 4ms ISI unless only the last eye camera frame of that world camera frame period is valid and the next valid frame is the very next eye camera frame, with the two following frames also invalid - i.e. eye camera frames with invalid=0 and valid=1 would be something like |001|100| during two world camera frame periods.
"1euro filter is adapting, this implies a higher processing load which would affect timestamping"
The frame timestamping happens in P1-3, while the 1euro filter is applied in P4, after the timestamping has already been applied. Also, the "processing load" is negligible in comparison to everything else.
"how come there are no ISIs of 5ms"
If I understood it correctly, there is some kind of traffic congestion that causes some frames to arrive more quickly once the congestion is resolved. But this is the part which I still need to understand in more detail.
"the precision measures are from this tracker - the software producing them is of less interest to the user"
I highly disagree with you on this. Most eye tracking hardware is just a set of cameras. The software processing the recorded images is where the core of eye tracking lies. Also, only by understanding the full pipeline in detail do you get the full picture.
Both the Pupil Core pupil detection and the Pupil Invisible gaze estimation pipelines work on a best-effort basis. They will take any image as input and try to detect the pupil/estimate the gaze direction. If you do not input the image of an eye, the result will of course be of questionable quality.
"world camera frames are at 66Hz"
This is incorrect. The world camera operates at ~30 Hz.
Could you share an example eye video recording of your artificial eye? I am curious what it looks like from the eye camera's perspective.
OK - so if the world camera operates at 30Hz, then it would be |XXXXX1|100000| during two world camera frames. I highly disagree that the hardware is not relevant here - there's a big difference between commercial trackers even in terms of temporal precision at the same sample rates, for example, based on whether the timestamping is hardware driven, as in the LC Technologies trackers, or software driven, as you have here. Also, there's a trade-off between eye camera resolution and spatial accuracy and precision, and many other things!
Apologies, I did not mean to discredit the hardware. Of course, the hardware plays an important role. I just wanted to say that the software is important, too.
"if the world camera operates at 30Hz, then it would be |XXXXX1|100000| during two world camera frames"
I do not understand this notation and what issue it relates to.
Hi @papr, @wrp, FYI @user-bbf437; Is there any clarity on the 4, 8, 12ms etc. intersample times in the 30Hz gaze data timestamps inherited from the eye camera at either 200Hz or 250Hz? I need to know whether the potential explanation offered above might account for this, or whether there is a way to clarify why we would see all those modes of intersample temporal distance in the 30Hz gaze timestamps. Could it be buffer issues and actually 250Hz, or is there another explanation? Thanks a lot!
Well - it is an attempt to say that the only possible way to get intersample times of 4ms inherited from a 250Hz eye camera in 30Hz world camera frames that select the 'most recent' gaze sample, is if the world-camera-frame-based gaze data has the most recently sampled eye camera image's timestamp, and all but the first eye camera sample of the next world camera frame is lost. X means it doesn't matter whether the sample is lost or not, 1 means a valid sample, and 0 means a sample/eye camera frame from which no gaze could be calculated - invalid/lost. Logically, 'most recent' becomes the first eye camera sample recorded after the previous world camera frame; that becomes the most recent even though it's actually ~20-24ms old at 250Hz, or ~25-30ms old at 200Hz.
Of course, if the eye camera is definitively sampling at 200Hz but producing intersample intervals of 4ms, then that's a very different, and potentially much bigger problem.
....possibly with the time-space continuum?
PS - regarding this: "how come there are no ISIs of 5ms" / "If I understood it correctly, there is some kind of traffic congestion that causes some frames to arrive more quickly once the congestion is resolved. But this is the part which I still need to understand in more detail." ...is what I meant by buffer issues yesterday, but that would not produce samples regularly 4ms apart with no samples 5ms apart, i.e. at the sample/eye camera frame rate. Thanks a lot and I look forward to your answers - this is a great system, and we'll try to max out the possible measures from it once we have a handle on the data quality constraints. I'll try to get you the eye video too, from the artificial eye recordings. Can you give me a quick pixel-to-degrees conversion ratio for the gaze data, or provide the necessary parameters to calculate it?
https://github.com/pupil-labs/pupil/blob/91b77327f1d9cb79454efd6a1fd6a7234f4420bc/pupil_src/shared_modules/camera_models.py#L140 You can find prerecorded camera intrinsics here. You can find more accurate values for your specific scene camera via the Pupil Cloud API. I will look up the API endpoint and come back to you.
Edit: This is the endpoint. You will need to use an API token generated from your settings page for authentication. https://api.cloud.pupil-labs.com/hardware/<serial number>/calibration.v1?json
Thanks! Is there a way to bypass the cloud and use the software locally? We can't upload experimental data to cloud - even for a second - without breaking data handling requirements for human research.
You do not need to upload recordings to use the api and access the intrinsics. You can use intrinsics locally when post-processing the recorded data with your own code. This is how you can access the recordings without uploading them to Pupil Cloud https://docs.pupil-labs.com/invisible/user-guide/invisible-companion-app/#recording-transfer
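For what it's worth, here is a rough sketch (not an official Pupil Labs recipe) of how one might convert a gaze point in scene camera pixels to degrees of visual angle with those intrinsics, assuming a standard OpenCV pinhole + radial distortion model. The camera matrix and distortion coefficients below are placeholders only and must be replaced with the values from camera_models.py or the Cloud endpoint:

import numpy as np
import cv2

# Placeholder intrinsics -- replace with the real camera matrix and distortion
# coefficients for your specific scene camera.
K = np.array([[766.0,   0.0, 544.0],
              [  0.0, 766.0, 544.0],
              [  0.0,   0.0,   1.0]])
D = np.zeros((1, 5))

def pixel_to_visual_angle(px, py):
    pts = np.array([[[px, py]]], dtype=np.float64)
    # Undistort and normalize: the result is an (x, y) point on the z=1 image plane
    nx, ny = cv2.undistortPoints(pts, K, D).reshape(2)
    azimuth = np.degrees(np.arctan2(nx, 1.0))    # horizontal angle from the optical axis
    elevation = np.degrees(np.arctan2(ny, 1.0))  # vertical angle from the optical axis
    return azimuth, elevation

print(pixel_to_visual_angle(544.0, 544.0))  # the principal point maps to ~(0, 0) degrees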
Ok, thanks!
Hello! I would like to ask a question about using the Pupil Invisible Glasses, which we have available for a research project. The project is about an assistive mobile app for the visually impaired. We would like to use only the scene camera of the glasses, directly from our app, to take a screenshot and/or video feed and get the data in real time in our app. Looking at the documentation, the following questions arose:
- Firstly, is the above functionality, or a similar one, available using the Pupil Invisible Glasses connected to an Android device? (If iOS is also supported, that is very welcome.)
- If this is possible, how should we proceed to develop this functionality with the available Pupil tools? (E.g. just treat it like an external camera? Use a driver?)
Thank you very much for your time!
Hi @user-28da12! This would be possible using Pupil Invisible, but depending on the details it might be more or less difficult! All the data generated on the device (including the scene video) is published to the local network through the phone, so any computer on the same WiFi, or another app on the same phone, can access all the data in real time. Receiving the data on a computer that supports Python is very easy; see the docs here for details: https://docs.pupil-labs.com/developer/invisible/#network-api Receiving the data in an Android app is difficult though, since you would need to implement a custom NDSI client for Android, which is not easy. We are however currently working on updating our real-time streaming protocol to RTSP, which would be very easy to receive on Android as well. The release of that update is expected this fall.
Note the following additional limitations:
- The PI Companion app is only compatible with one specific phone model, the OnePlus 8. This is mostly because the gaze pipeline is only compatible with this model, which does not seem to be relevant for you, but the app is still set to be not installable on other models.
- Further, even though you are not using it, the gaze pipeline will still run in the background, limiting the battery life to ~180 minutes.
Alternatively, you could access the scene camera directly from within your own app. The cameras can be interfaced with UVC and there exist 3rd party apps already that can do that. We would however not be able to support you in that development.
Thank you very much for the detailed and comprehensible answer! I will further discuss and explore the above options with my team
@user-bbf437 are you able to upload to cloud if no scene video is uploaded for example?
"Densifying" gaze data post hoc in cloud is re running eye videos through our DL gaze pipeline. While we would love to be able to open source everything, this pipeline is closed source and therefore only available in cloud (not desktop).
Thanks @wrp, am I wrong to think that it only needs the two eye camera videos to work? We may still get into trouble for uploading these too; I will check with my supervisors to be sure. Would the DL gaze pipeline also be available as standalone, downloadable software?
Both eye videos are required, correct. Gaze pipeline is not available as desktop SW.
Hi there, quick question: is there any problem with the cloud system? I'm trying to upload a couple of recordings but this is what I get after more than 20 minutes...
Hi @user-aaa87b! Please try the following to fix the issue:
1) Make sure your app version is up to date.
2) Log out of the app and log back in.
3) In case that has not yet fixed the problem, delete the app's cache. Do this by holding the app icon -> info -> storage -> clear data. This deletion will not remove any recording data, only internal app data.
Let me know in case the issue persists!
It worked! Thanks a lot.
Hello Pupil Labs team, I'm trying to get IMU data with this code: https://docs.pupil-labs.com/developer/invisible/#network-api. I added "imu" to SENSOR_TYPES but I can't find the command to get the data. Could you help me please?
Hi @user-59adb5. Once you have added "imu" to sensor types, you can add a few lines of code in the 'Fetch recent data' section starting at line 43:
elif sensor.name == "BMI160 IMU":
    ts, accel_x, accel_y, accel_z, gyro_x, gyro_y, gyro_z = data
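In case it helps, here is a minimal sketch (a hypothetical helper, not part of the official example) for dumping the collected IMU tuples to a CSV, assuming the tuple layout shown above:

import csv

def write_imu_csv(samples, path="imu_data.csv"):
    # samples: a list of (ts, accel_x, accel_y, accel_z, gyro_x, gyro_y, gyro_z)
    # tuples collected in the fetch loop above.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["timestamp", "accel_x", "accel_y", "accel_z", "gyro_x", "gyro_y", "gyro_z"])
        writer.writerows(samples)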
Thanks a lot!
Hi @wrp, @user-bbf437, Since this is academic research involving human data, we can't allow any access whatsoever, for any period of time, outside our double-layer-protected and fully specified machines. Every person who could possibly access any part of the data for any period of time has to be named and approved; I guess this will not be an unusual situation for researchers. Is there no standalone app that could be used for research data? We wouldn't necessarily need open source; an app that runs offline would be fine.
You can use Pupil Player to open, play back, and export raw data to csv for individual Invisible recordings. However, the gaze pipeline is only available on PI Companion (at ~66Hz with the OnePlus 8) and Pupil Cloud (post-hoc densification at 200Hz).
There are other academic research institutions using Pupil Invisible + Pupil Cloud. No human will access your recording data in Cloud unless requested to do so by you (e.g. in the case of an error), and if this is the case, the full chain of access will be logged and can be shared with you on request.
The eye video cameras run definitively at 200 Hz. We verified this by looking at the hardware timestamps provided by the camera. Unfortunately, these are subject to drift which is why using software timestamps is more accurate on average than using hardware timestamps.
Software timestamps are measured as soon as the frame data arrives on the USB host (the Companion device). Unfortunately, the USB transmission is not continuous but subject to a schedule with limited bandwidth. This schedule has a slightly different frequency than the camera sampling. The different modes in the software-timestamp-based ISI distribution are an effect of the interference between these two frequencies.
OK thanks - so why does the gaze data have only multiples of 4ms as intersample times, then? I could understand if lag caused intersample times larger than 5ms at 200Hz, but not shorter times...? Also, I would really appreciate a response to the questions above regarding what happens when there's eye camera loss, and whether this would explain the variance in intersample times etc., beyond the fact that they are in multiples of 4ms only, with no multiples of 5ms/200Hz present. Thanks a lot; when we know this we can think about how to handle the data for our purposes and for valid fixation/saccade detection, or at least know what effect size could be detected for cognitive measures.
...note also that the standard deviation of the intersample times in the gaze data is 4.209ms. This should not be the case, then?
This is a very high-level description of how I understood the problem. The actual details are subject to how USB transfer works at the lowest level, and I personally do not feel comfortable claiming that I understand these details. Nonetheless:
The USB bandwidth is sufficient for transferring eye video data at 200 Hz on average. But this bandwidth is not available all the time. Instead, at some time points there is no transmission, and at others there is more bandwidth available than necessary. The camera always tries to transfer the frame data as quickly as possible. If the frame finished exposing but the transfer schedule does not have bandwidth available at that time, the frame needs to wait. Meanwhile, the next frame is being exposed. Once bandwidth becomes available, the queued frames are transferred and the congestion created by the USB transfer schedule is resolved.
This explains why there are some frames with higher isi than others as well as why some frames arrive shortly after each other. Frame exposure is not blocked by the USB transfer schedule. Instead, the camera continues exposing frames while previous frames wait for their transfer.
I cannot explain it better than this. I really hope this makes some sense to you.
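To make that concrete, here is a toy simulation (illustrative numbers only, not the actual USB parameters): frames finish exposing every 5 ms, but a software timestamp is only taken when a hypothetical 4 ms transfer slot becomes available. The resulting ISIs are all multiples of 4 ms, yet they still average 5 ms, i.e. 200 Hz:

import numpy as np

FRAME_PERIOD_MS = 5.0  # true camera period (200 Hz)
SLOT_PERIOD_MS = 4.0   # assumed USB transfer-slot period, purely for illustration

exposure_times = np.arange(0, 1000, FRAME_PERIOD_MS)  # when frames finish exposing
slots = np.arange(0, 2000, SLOT_PERIOD_MS)            # when transfer bandwidth is available

software_timestamps = []
next_slot = 0
for t in exposure_times:
    # Each frame waits for the first slot at or after its exposure time that has
    # not already been used by a previously queued frame.
    while slots[next_slot] < t or (software_timestamps and slots[next_slot] <= software_timestamps[-1]):
        next_slot += 1
    software_timestamps.append(slots[next_slot])

isis = np.diff(software_timestamps)
print(np.unique(isis))  # [4. 8.] -- only multiples of 4 ms
print(isis.mean())      # ~5 ms on average, i.e. still 200 Hz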
"standard deviation in intersample gaze data is 4.209ms"
Please remember that gaze timestamps are a subset of the eye video timestamps, because gaze cannot be inferred at the same frequency as new eye video frames are being sampled. Therefore, it is expected that the gaze timestamp std is higher than the eye video timestamp std.
@papr @wrp Thanks - but I'm not sure this makes sense? If the eye camera is sampling at 200Hz and passing those timestamps on unedited through the pipeline, as you described above, then all intersample times should be some multiple of ~5ms, even if it's a subset of those samples inherited in 30Hz world camera frames, I think? There are no samples with an intersample interval of around 5ms - so I am still just as confused about where this is coming from, even though I understand the pipeline and the subsampling of eye camera timestamps etc. It just doesn't explain the data we're getting and plotted above. Apart from that, could you confirm or refute the proposed reason why short and long intersample times would be present in 30Hz gaze data/real-world frames due to extensive loss - is that proposal correct? We need answers to plan the analysis of data due to be recorded from a clinical group in August, and we can't properly design the experiment without knowing which eye movement measures are valid - e.g. to what degree fixation duration measures are valid, or what the granularity is there, which is in turn entirely dependent on the spatial and temporal accuracy and precision...
Also, the gaze pipeline does not have a concept of time. It just takes two images as input and produces a 2d point in scene camera coordinate space.
If the temporal precision/sampling regularity is unknown up to tens of ms, then we will have a hard time extracting e.g. fixation durations validly. If it's very irregular but known and measured, we can still work with it to get fixation durations at the granularity indicated by the 30Hz scene camera (+-33ms, roughly - though as you can see, there are almost no gaze samples with that interval or close to it, which should be around the middle of the most common interval mode if the most recent eye camera sample's timestamp were inherited at 30Hz without extensive loss).
Here's the plot again, to save you scrolling:
Well, I have a question. I have two maps that my Pupil Player can't open. However, the recordings are still intact, I guess. Can someone help me?
Hi, could you please clarify what you mean by "maps"? If you have trouble opening a recording in Player, please share the player.log file (Home directory -> pupil_player_settings -> player.log) so that we can check what the actual issue is.
2021-07-09 16:22:30,227 - MainProcess - [DEBUG] os_utils: Disabling idle sleep not supported on this OS version.
2021-07-09 16:22:30,612 - player - [INFO] numexpr.utils: NumExpr defaulting to 6 threads.
2021-07-09 16:22:39,483 - player - [ERROR] launchables.player: Process player_drop crashed with trace:
Traceback (most recent call last):
  File "launchables\player.py", line 934, in player_drop
  File "pupil_recording\recording_utils.py", line 40, in assert_valid_recording_type
  File "pupil_recording\recording_utils.py", line 59, in get_recording_type
  File "pupil_recording\recording_utils.py", line 119, in _is_pupil_invisible_recording
  File "pupil_recording\info\recording_info_utils.py", line 90, in read_info_json_file
  File "json\__init__.py", line 299, in load
  File "json\__init__.py", line 354, in loads
  File "json\decoder.py", line 342, in decode
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 693)
Looking at the error message again, the issue could be related to something else as well. I could confirm this if you shared the info.json file with [email removed]
This is a known issue where Pupil Player is unable to decode a non-ascii character (likely part of the template data) in your info.json file. The fix will be released in our new v3.4 version next week. Until then, you can edit the info.json and remove non-ascii characters as a temporary workaround.
Maybe a stupid question, but how can I remove that?
The JSON file is a structured text file that can be edited with any text editor.
I sent both files.
Thank you. The issue is indeed not related to non-ascii characters. You can fix it by opening the files in a text editor and deleting the last line (containing text similar to nfo.json"]} ).
Could you please also share the android.log.zip files of these recordings such that our team can investigate the cause of the issue?
Sent both.
Thank you very much!
@papr Can you comment on the roadmap for the Invisible? Specifically, 1) the ability to calibrate to an external reference and calculate approximate EIH signals, and 2) the ability to filter based on some measure of confidence, or at least to detect and filter out blinks?
SOS - I have a 53-minute recording that I have been annotating (I am 25 minutes in)... Pupil Player is not responding and my cursor is just spinning round and round. I do not want to force quit and lose all of my annotations. Is there something I can do? Does the data save automatically if I force quit?
When I export my Pupil Player files there is no fixation csv... why would this be? Also, there is no such thing as a "Fixation Detector" in my Pupil Player. In order to do my work I need the fixations and fixation durations. Can someone please help me?
Hi! Are you using a Pupil Invisible recording? Fixation detection is not available for Pupil Invisible within Pupil Player. The corresponding plugin cannot be enabled, and the exports thus do not include the fixation csv file. Within the next couple of weeks we will release a fixation detector for Pupil Cloud though! Did the non-responding Pupil Player problem resolve itself?
I guess what I am trying to say is that I tested the Invisible out and fixations were present, and when they did a test for me before I purchased the glasses, they showed me the durations tab and how it would be what I needed for my research. So the fact that this is no longer an option for the Invisible is a concern.
Was this always the case - that there was no way to detect fixation points with the Invisible in Pupil Player? Because I was sold on the product being able to do that. Also, if a fixation detector is being put into Pupil Cloud, will this allow me to code all of the fixation points in Pupil Cloud and download the data from there? I guess I am confused, because Pupil Cloud does not have any of the annotations or data collection trackers, so will this be a new feature where I am actually able to code data in Pupil Cloud? The non-responding Player did not recover; I had to start all over.
Dear Pupil Labs engineers, I am wondering why the data frequency is different between the Companion and Cloud for the Invisible. Is it just some kind of "license" issue, or does some post-processing or "(deep learning based) filling" procedure happen in the cloud?
Hi @user-b89d82! Pupil Invisible's eye cameras operate and save video at 200 Hz. However, due to computational resources, we only perform real-time gaze estimation on the Companion device at around 66 Hz. Once the data gets uploaded to Pupil Cloud, we perform gaze estimation at the full 200 Hz! (It has nothing to do with licensing and/or deep-learning-based filling.)
Hi @user-d8879c! Traditional fixation filters, like the one in Pupil Player, generally work well in static conditions, e.g. when the wearer is seated. However, they become less reliable in dynamic conditions, e.g. when the wearer moves freely in their environment. We chose to disable the dispersion-based filter for Pupil Invisible recordings because a) we had not formally tested it with Pupil Invisible, and b) we are releasing a new velocity-based filter for Pupil Invisible that will provide more robust fixation classification in the dynamic conditions often expected with Pupil Invisible use. For reference, you can also make annotations within Pupil Cloud. Once you have added your recordings to a Project, you can insert events (annotations) in the 'Events & Sections' window (far right of the screen)! The fixation detector will be added as a free update, and then you will be able to download fixation and annotation data from Pupil Cloud.
@nmt If you aim to go with velocity-based event detection, how will you deal with the temporal precision issues above? It seems there's quite a bit of uncertainty regarding why a 200Hz eye camera would only ever produce samples 4ms, or multiples of 4ms, apart, and there also seems to be uncertainty as to which of several possible eye camera samples and timestamps, with arbitrary delay, gets inherited by the gaze sample stream. I'm still waiting for any information that would allow us to test different event detection methods and see which one gets us closest to real behaviour - we could do this e.g. by dual recording with a high-end eye tracker and running the PI gaze data through various event detection algorithms. Thanks...
@nmt Thanks for clarifying. Isn't it possible to do the same locally on a powerful enough system, like Core does?
Please see this message for reference https://discord.com/channels/285728493612957698/633564003846717444/861919877124456468
@papr continuing... https://discord.com/channels/285728493612957698/285728635267186688/864258349445283890
@nmt @papr https://discord.com/channels/285728493612957698/285728635267186688/864258349445283890
We want to take measures of e.g. cognitive load via fixation durations, and the effect sizes are not very large - so is there any explanation you can offer for why the gaze and eye samples are coming out at multiples of a 4ms sampling interval? If the eye camera and original timestamps should be at 200Hz and unedited, then the graphs above, which show intersample times for the data we've recorded with the PI, do not make sense; neither does introduced lag - it's waaaaaay too regular. Please note the gaze ISI here is from gaze data, not the eye camera - that's below. This has a direct impact on the choice of, for example, dispersion- or velocity-based event detection, as well as on what granularity of e.g. fixation durations we can expect to measure validly, I think?
papr - Today at 11:29 PM: At the moment, I can only offer my previous response to this question https://discord.com/channels/285728493612957698/633564003846717444/862723198617649172
papr - Today at 11:31 PM: @user-359878 Also, these numbers are averages. The scene camera has a variable frame rate, too.
Human Bean - Today at 11:32 PM: There is no way the data above would lead to averages at 200Hz/5ms intersample time (ISI), though?
It's a small point compared to the question of which of many eye camera frames is actually inherited in the gaze data, though. That implies a much larger range than even the ISI range, because with 200/250Hz eye camera data feeding 66Hz gaze data, which sample/frame is used to estimate gaze position as 'most recent' means that, depending on the loss rate, 'most recent' could be anywhere from 4ms previous right up to ~30ms previous or more...??? Or is there a way to know which of several possible samples contributing timestamps to the gaze data was used to calculate the gaze position?
"There is no way the data above would lead to averages at 200Hz/5ms intersample time (ISI), though?"
If you look at my previous notebook looking at the ISI in detail https://nbviewer.jupyter.org/gist/papr/aa376b50909130c772b1608c51f827cd#Intersample-intervals-over-time
You can see the 1x 8 ms difference + 3x 4 ms difference pattern which I have explained in the linked message. That is a total duration of 20 ms for 4 samples. 20 ms total duration / 4 samples = 5 ms isi on average. That is exactly 200 Hz.
"Or is there a way to know which of several possible samples contributing timestamps to gaze data was used to calculate gaze position?"
"Which of several possible samples" - you are referring to eye video frames, correct? If this is the case, you know the answer to the question already: each gaze position is calculated based on a binocular pair of eye images, of which it inherits the left eye's timestamp. So yes, it is possible to know which of several possible (eye video frame) samples contributes (image data and timestamp) to the gaze data.
@papr @nmt Your notebook, and every answer so far, seems to assume no loss of eye camera frames/gaze data inferred from eye camera frames, correct? So they don't answer the central question I'm asking: what happens when the "most recent" valid frame from the eye camera is not the most recent frame? I don't know how to ask this any more clearly... I know the ideal case - that's what you've described - but I'm dealing with the real case of data with loss, which is perfectly normal in small amounts in any eye tracker. I need to know whether there is any way to tell if the 'most recent' eye camera sample, contributing gaze data and timestamps to the gaze data, really was the most recent, or whether it was up to ~30ms old when there is loss. If the 'most recent valid' sample from the eye camera is inherited in the gaze data, rather than the 'most recent' sample with 0,0 indicating loss, then there is considerable uncertainty above and beyond the 4ms actual sample rate according to the eye camera timestamps, and that would explain why those highly variable ISIs exist in the resulting gaze data at the gaze sample rate at all. There should be no ISIs of 4 or 8ms in 60Hz/66Hz data unless there's loss in the eye camera frames and it's handled as I've described several times now for confirmation. You're answering another question that I'm not asking - I realise it's frustrating, but I'm just waiting for this very basic information: how is eye camera loss handled? Could you randomly remove some of the eye camera frames, as producing no valid gaze data, in your notebook example and describe, for that case, what happens in the gaze data stream?
If this temporal imprecision is not measured and reported, or possible to gauge for each sample, then velocity-based event detection must be ruled out, because the velocities will be all wrong due to the temporal imprecision. We are fine to figure it out once we know what's going on with the data. Thank you.
If the gaze data is 66Hz and inherits the most recent sample from the eye camera, then your ISIs should be somewhere around 17 +/- 4 (or 5...) ms. As you can see in the plot above, however, the majority of ISIs in the gaze data are far shorter than this. Is this due to loss of eye camera frames and inheriting the 'most recent valid' sample through loss? That would explain the very short ISIs, as described many times, but I have no answer as to whether that's what's happening. Thanks.
Could you please elaborate on your understanding of a valid vs an invalid eye video frame sample?
Valid = produces gaze coordinates, i.e. correct feature extraction. Invalid = no gaze coordinates; either missing from the system, or 0,0 gaze coordinates with a timestamp (it's bad to use 0,0 to indicate loss as it is a real coordinate, but some trackers do this nonetheless).
Ok. Every processed (as in: received in the application from the camera) video frame and its corresponding timestamp ends up in the corresponding video + .time files. These include all frames, "valid" and "invalid" by your definition. The video recording process does not differentiate between "valid" and "invalid" frames.
Once the gaze inference process is ready for a new pair of images, it gets the most recent frames from the two eye video processing processes. At this moment, these frames become "valid" frames by your definition - not earlier.
If A = PI left v1 ps*.time is the set of all timestamps, and B = gaze ps*.time is the set of all "valid" timestamps, then every timestamp present in A but not in B would be an "invalid" timestamp.
That said, this is a highly problematic definition of valid/invalid frames because the recorded frames are fine and can be used for gaze densification in Cloud. I suggest using "eye video frames used for inference" vs "eye video frames not used for inference" (clumsy but more accurate/less ambiguous).
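As a rough illustration of that A \ B idea (not official Pupil Labs code, and assuming the .time files are flat arrays of little-endian uint64 timestamps in nanoseconds -- please verify the dtype against your own recordings), you could list the eye frames that were not used for real-time inference like this:

import numpy as np

eye_ts = np.fromfile("PI left v1 ps1.time", dtype="<u8")  # A: all left-eye frame timestamps
gaze_ts = np.fromfile("gaze ps1.time", dtype="<u8")       # B: timestamps inherited by gaze samples

not_used = np.setdiff1d(eye_ts, gaze_ts)                  # A \ B: frames not used for inference
print(f"{not_used.size} of {eye_ts.size} left-eye frames were not used for real-time gaze inference")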
It's OK if it takes a bit of time to answer this - but it would be great to know that you understand what I am asking!
As I said before - great piece of kit, we're just trying to max out what we can validly measure from it.
Fixation durations require clarity on this or they're not valid.
I think just taking your notebook illustration, changing it so that some eye camera frames produce no gaze data, and describing what happens would suffice!
Thanks again
OK. So you are saying that there is no instance where the eye camera frames produce 'loss', i.e. cannot be used to infer gaze? In other words, the 'most recent' L+R eye camera frame pair is always available for gaze estimation, meaning the gaze data always has a 17ms + 0-5ms intersample time if you are not uploading to cloud, with no samples missing? This is what I understand from your reply above - and that would be good to know - but it still doesn't explain the actual intersample times of 4ms and multiples thereof in the gaze data...? What am I missing here? Thanks.
It is technically possible that the gaze inference process encounters a decoding issue when decoding raw data from the image pair. In this case, the process drops the image pair (it will still be written to disk by the other process). The image pair basically becomes "invalid" again, by your definition. This should only happen very rarely and only in the context of physical connection issues or hardware-related camera issues.
The variance is more easily explained by the gaze inference process taking a variable amount of time to infer gaze from each image pair. The longer the gaze inference takes, the longer the gaze isi will be.
"doesn't explain the actual intersample times of 4ms and multiples thereof in the gaze data..."
This is just a result of the eye video software timestamps having multiple-of-4ms ISIs. If the eye video timestamps had multiple-of-11ms ISIs, so would the gaze data.
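A quick way to see this, as a toy sketch built on the 1x 8 ms + 3x 4 ms pattern described above (the keep-every-third-frame rate below is an illustrative stand-in for the real, variable ~66 Hz inference): any subset of timestamps whose ISIs are multiples of 4 ms itself only shows ISIs that are multiples of 4 ms, clustered near 15 ms but never exactly 15 ms:

import itertools
import numpy as np

# Eye camera software timestamps following the 8/4/4/4 ms ISI pattern discussed above
eye_isis = list(itertools.islice(itertools.cycle([8, 4, 4, 4]), 200))
eye_ts = np.concatenate([[0], np.cumsum(eye_isis)])

# Illustrative assumption: the inference process keeps roughly every third frame (~66 Hz)
gaze_ts = eye_ts[::3]

print(np.unique(np.diff(gaze_ts)))  # [12 16] -- only multiples of 4 ms, nothing at exactly 15 ms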
Getting closer to the problem now via reverse engineering - the following plots are very informative; see how regular the loss indicated by the autocorrelation is. First up, velocity-based event detection seems extremely difficult (and currently unwise...) for this data due to this temporal imprecision, and then you see the regular loss of eye camera samples (every 4th one?) and other things we would really like to understand, which are quite strange...
@papr @nmt Maybe these graphs help add some clarity; I've included one similar to your notebook so you can see how your diagram compares to our recording. Please note that this data was not uploaded to cloud - so in the gaze data there should be 66Hz gaze position sampling with the 'most recent' timestamp inherited from the left eye camera, adding some lag to the ~15ms sampling interval implied. Please let me know if you see the problem, or whether we should e.g. further analyse via dual recording to figure out what's going on. I disagree that if you 'inherit' or subsample 200Hz data's timestamps into 66Hz gaze data, taking the 'most recent' timestamp from the 200Hz stream and pasting it onto the 66Hz gaze data, with eye camera loss being very rare, you should ever get intersample times of 4, 8, or 12ms in the gaze data. It should not be possible based on inheritance of the more granular data's timestamps. I understand that issues with lag may cause delay, and may also cause a bunch of samples to be written close together - each of which adds significant temporal imprecision, and that may be more relevant here - but there are multiple potential sources of error, so I've included the dataset we recorded too, so you can see why your very patient and appreciated replies don't explain what we're getting in the data. Thanks.
In the left eye camera, the temporal imprecision of sampling averages just under 2ms (std. dev.), with maximum loss indicated by just over 20ms between samples. When those timestamps are inherited/subsampled and pasted onto the 66Hz data, the time between samples ranges from 4ms to more than 20ms in the short, zoomed-in period shown in the second graph; most frequently they are 12ms apart, i.e. ~82Hz?! In the eventual gaze timestamps shown in graph 3, the most common time between samples is 12ms, followed by 16ms, with nothing in between - quite tight distributions around modes that are multiples of 4ms. 66Hz is not inherited - there are no gaps of about 15ms between samples; 60Hz looks fairly common, with ~16.66ms ISIs. The temporal imprecision of the gaze data is just under 4ms (std. dev.). Over an e.g. 500ms fixation duration, this imprecision in sampling is additive over 33 gaze samples, or 100 eye camera frames.
"the regular loss of eye camera samples (every 4th one?)"
I need to repeat myself: the 4th frame is not dropped or lost. It is just delayed due to USB transfer scheduling.
Well, that's even worse than loss - because you've got gaze coordinates reported with the wrong timestamp, frequently off by 20ms or more per sample! Correct?
Well, yes, I think we went over this. The recorded timestamps are the timestamps at which the frame data reaches the phone, not the time at which the frame was actually exposed.
PS - we can correct for ALL of this.
Yes - in other words - this is temporal inaccuracy, alongside temporal imprecision, but that isn't the ONLY source of error here - see above.
This would seriously mess up event detection. Temporal accuracy and precision are just as important as spatial accuracy and precision, especially since it was mentioned somewhere above that velocity-based event detection would be used.
We will figure out how to correct for all this, do dual recordings to help demonstrate exactly what the effect of such temporal issues is on recording eye movements, and send it all to you.
Meanwhile, the plots above are very telling!
They are based on the dataset I sent before.
If you look at the left eye camera timestamps in there, you can see the regular gap every 4 samples or so throughout.
That is exactly what I am referring to in this message https://discord.com/channels/285728493612957698/633564003846717444/864264578300510235
Well - it's not quite so systematic, nor the whole story; see the plot above comparing previous lag to current lag... and the negative correlation... anyway, more later.
We'll attempt to deal with this in a similar way to the ioHub.
Could you link the ioHub reference/implementation?
It's part of the general psychopy release.
I mean the specific part that does this "correction". I am not too familiar with the psychopy code base. And skimming over https://www.psychopy.org/api/iohub/starting.html#iohubconnection-class I cannot find anything specific to timestamp correction.
Any thoughts on the paper sent previously, with the artificial eye to remove processing variance?
It's done when the particular system is integrated, based on the known pipeline.
So, of the 3 existing integrations https://www.psychopy.org/api/iohub/device/eyetracker.html do you know one which definitively has this correction included? I am just trying to find details about this correction.
Here's a link to the original standalone version from our project, but it's long out of date - maybe there's documentation in there. https://github.com/isolver/ioHub
8 years is indeed a long time
Yep - another reason to integrate with ioHub now!
Researchers everywhere would be very happy...
We are working on that for Core. Btw, I am struggling to find best practices on how to use the ioHub API. Do you have a minimal example experiment that uses one of the existing eye tracker classes?
I didn't write that code - Sol is the guru on these things - so I'd rather just refer you to his documentation, but if you want to integrate then contact him; he is usually exceptionally generous if you have the time to do the work...
I think the old repo above has such a minimal example, have a look and if you don't find what you need ask again and I'll dig something up.
The included demos look good for a start, indeed. Thanks!
Incidentally, we've just used one of the switches Sol is building now with the PI tracker, so we get very carefully coupled response times with the eye data. It works great and may be helpful for many who want e.g. a reliable common trigger for data streams - see here: http://labhackers.com/
and specifically the MilliKeyDelux Upgrade here: http://labhackers.com/millikey.html
The light sensor would remove all problems with having the tracker running separately from the display, for example.
@papr does it seem reasonable that you might have 5ms per sample assumed on the phone, when it's really something like 5.0065160515003005 ms in the camera, leading to drift? This would explain the 1ms shift in offset we're seeing every 153 samples (i.e. every 765 ms), but there are other sources of error too - still thinking, but confirmation of this would help.
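For reference, the arithmetic behind that hypothesis, using the numbers quoted above:

true_period_ms = 5.0065160515003005  # hypothesised actual camera frame period
assumed_period_ms = 5.0              # period assumed on the phone
n_samples = 153

print(n_samples * (true_period_ms - assumed_period_ms))  # ~0.997 ms of accumulated drift
print(n_samples * assumed_period_ms)                     # 765.0 ms elapsed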
Hello, what is the PI's IMU data format and what are the units? Would it be accel_x, y, z, rotation x, y, z? Thanks
Hi @user-bbf437! Please see the imu.csv file here for format and units!
https://docs.pupil-labs.com/cloud/enrichments/#export-format-4
Hello Pupil Labs Team! Are the timestamps for the world images in local recordings set on the Pupil Invisible itself or on the phone? I mean, do the timestamps represent the exact time at which an image was made or is it the time when the image was made + the transfer time from the Pupil Invisible to the phone. And if it includes the transfer time, how big this latency could be?
The timestamps are not measured at frame exposure but on reception on the phone. The transfer time is about 30 ms for the eye cameras but is subject to variance. See my messages with @user-359878 for a very detailed discussion.
Thank you!
Hi! Is it possible that there could also be a time delay between the timestamps of the gaze data and the IMU data? In the figure, the head rotation precedes the change in eye movements, but normally it is the other way around.
Has there been any updates on methods for correcting these timestamps?
Unfortunately, not.
Hi, I have taken some recordings using the Invisible and now have them in Pupil Player. I can only seem to get 'Vis Fixation' to show up, and not 'Vis Cross' or any of the other 'Vis' options. Any tips? Thanks in advance!
Mmh, I just tried reproducing the issue but I can't. The other visualizations show as expected. Can you reproduce the issue with every recording of yours, or only this one?
I just tried it again with another recording and I'm having the same issue. In the photo you can see I have 'Vis Circle', 'Vis Fixation', and 'Vis Cross' turned on, but only Vis Fixation is showing up. Perhaps it's a setting I have to adjust in the Pupil Companion app? Tomorrow when I'm back in my lab I will have a look.
Actually, could you go to the general settings and set the minimum data confidence to zero?
That worked! Thank you very much!
Ok, that means that your gaze data has a confidence of zero. This only happens if the headset on/off classifier estimates that the headset is not being worn at that time. Could you share a screenshot/picture of Pupil Player with the eye video overlay enabled?
ok, thank you!
Which version of Player are you using? Would it be possible for you to share the recording (or any recording that shows this issue) with [email removed] I would be interested in finding out what went wrong here.
I am using version 3.4.0 on macOS, but I was having an identical issue on the lab computer, which runs Windows 10. I downloaded Pupil Player today on both machines, so I assume it's the most recent version on the PC as well. Sure, I'll send an email; any reference I should add? Since the whole recording is nearly 1GB I'll send a Google Drive link.
"I am using version 3.4.0"
Ok, perfect!
"Sure, I'll send an email, any reference I should add in?"
That would be great! Just mention the issue briefly. Sending a gdrive link is fine with us.
Thank you, I just sent off an email then.
Hi! Is there a fundamental reason why it is unavailable for Pupil Invisible? (I went through the code very quickly and did not find anything exotic about the Invisible glasses, i.e. no unavailable data.) I came across an intentional disable around line 280 of fixation_detector.py, so I am guessing I'm missing something. Could you enlighten me please? If it's not the case and it is just not a priority at the moment due to lack of time, feel free to tell me. I might consider stepping into the code!
Hi @user-e0a93f! Thank you for the offer! We have recently disabled the fixation detector for Pupil Invisible recordings because it does not perform well. Pupil Invisible gaze data is noisier than Pupil Core gaze data, which the algorithm is not designed to cope with. We have however been working on a new fixation detection algorithm that is 1) compatible with Pupil Invisible and 2) more accurate in general compared to traditional algorithms. We will release this algorithm to Pupil Cloud probably still within August, and a bit later, around September, to Pupil Player!
Thank you very much! This feature will be a game-changer for me. While I am at it, will you implement saccade and smooth pursuit detection?
It's something we are looking into, but doing that robustly is of course much harder. I wouldn't expect a release anytime soon!
Ok thanks!
You're welcome!
@user-2d66f7 FYI - my team has also found temporal offsets between gaze and IMU data. PL is aware and actively trying to address the issue. The offset is not constant between recordings, and perhaps not even within a recording.