🕶 invisible


user-b10192 01 June, 2021, 00:02:33

Hi there. (1) The Pupil Invisible app is crashing when we use the video sensor over the NDSI network. How do we avoid that? (2) Are we able to get timestamps for the eye cameras? Or how are these determined in pupil_positions.csv as pupil_timestamp? We took screenshots when the app was crashing.

user-0f7b55 01 June, 2021, 07:24:57

What controls are you trying to access over NDSI?

user-b10192 01 June, 2021, 02:28:50

Chat image

user-b10192 01 June, 2021, 02:29:38

Chat image

papr 01 June, 2021, 07:57:46

Regarding your second question: Pupil Invisible Companion publishes a timestamp with each video frame in nanoseconds since the Unix epoch. This timestamp is converted to seconds in pyndsi (note: this transformation loses some timing precision).

To access eye camera timestamps in realtime you need to access the eye camera video streams via NDSI.

Pupil Player realigns Pupil Invisible timestamps to the recording start (by subtracting start_time from info.json) before converting the nanoseconds to seconds to keep the full timing precision. In other words, the timestamps stored in the _timestamps.npy files are no longer in Unix epoch but relative to recording start. You can either read these files using the Python function numpy.load() or use the eye video exporter in Pupil Player which also generates a CSV file containing all eye video timestamps.

Note: Pupil Player no longer exports a pupil_positions.csv file for Pupil Invisible recordings, since such recordings do not include any pupil data (only gaze data).
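
For reference, a minimal sketch of reading such a file and recovering absolute Unix-epoch time; the timestamps file name is a placeholder, and the start_time key is the one mentioned above:

import json
import numpy as np

recording = "path/to/recording"  # hypothetical Player-imported Pupil Invisible recording folder

# _timestamps.npy files hold seconds relative to recording start (see above)
eye_ts_rel = np.load(f"{recording}/eye_timestamps.npy")  # placeholder file name

with open(f"{recording}/info.json") as f:
    start_time_ns = json.load(f)["start_time"]  # recording start in nanoseconds since Unix epoch

# Recover absolute Unix-epoch timestamps in seconds (float64 loses some sub-microsecond precision)
eye_ts_utc = eye_ts_rel + start_time_ns * 1e-9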

user-b10192 01 June, 2021, 21:49:49

Thanks for the answer

user-997dee 01 June, 2021, 08:27:50

Hi Marc, thank you for your quick reply! I just sent some data along with what I think happened during the crashes.

marc 01 June, 2021, 08:43:17

Thanks @user-997dee! I'll check it out and get back to you!

user-d1dd9a 01 June, 2021, 11:54:32

What does pts (points?) mean in the world_timestamp.csv ?

papr 01 June, 2021, 11:57:56

Hi 🙂 pts stands for presentation timestamps and refers to the video file's internal time information. You can use it to programmatically seek to a specific frame within the video.
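
As an illustration, a hedged sketch of seeking by pts with the third-party pyav library; the file name and pts value are placeholders:

import av

container = av.open("world.mp4")               # placeholder video file name
video = container.streams.video[0]

target_pts = 123456                            # a value taken from the pts column of the CSV

# Seek to the nearest keyframe at or before target_pts, then decode forward to the exact frame
container.seek(target_pts, stream=video, backward=True, any_frame=False)
for frame in container.decode(video):
    if frame.pts >= target_pts:
        break                                  # `frame` is now the requested video frame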

user-d1dd9a 01 June, 2021, 12:13:08

Oh thx. And the world_index in gaze_position refers to the particular video frame / line in the world_timestamp.csv?

papr 01 June, 2021, 12:13:26

correct!

user-d1dd9a 01 June, 2021, 12:16:58

Is the export of "gaze_position" limited to confidence 1.0 only?

papr 01 June, 2021, 12:19:11

Pupil Invisible data only has two possible confidences: 0.0 and 1.0. It is based on the worn *.raw file, which is generated by an algorithm that estimates whether the subject is wearing the glasses or not. The export is not limited to a confidence threshold.

user-d1dd9a 01 June, 2021, 12:28:37

Ah OK, so I have only two confidence levels. But when I set the minimum data confidence in Pupil Player to 1.0 I have no vis circle.

user-d1072e 12 April, 2022, 19:46:26

Hi, just a quick question, where can I find this information, about "Pupil Invisible data only has two possible confidences: 0.0 and 1.0"? Because I remember Core still has all values in the range, but there seems to be no announcement about this binary confidence score for Invisible.

papr 01 June, 2021, 12:30:03

Correct, because Player only displays data with confidence larger than that value, not larger or equal.

user-d1dd9a 01 June, 2021, 12:30:51

o.k. 🙂

user-d1dd9a 01 June, 2021, 12:38:00

I see in gaze_points that sometimes gaze points at different times refer to the same video frame. How is that treated in the video?

papr 01 June, 2021, 12:38:25

Can you specify which video you are referring to?

papr 01 June, 2021, 12:40:18

Pupil Invisible records gaze data at a higher frequency than the scene video. As a result, you get multiple gaze estimations per scene video frame. Player displays all gaze points that are assigned to the current frame. Cloud displays the temporally closest gaze point to the current frame.
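
For example, a small sketch (assuming a Pupil Player gaze_positions.csv export) that groups all gaze samples assigned to the same scene frame:

import pandas as pd

gaze = pd.read_csv("exports/000/gaze_positions.csv")   # path is an assumption

# Several gaze rows can share one world_index (scene frame); group them per frame
per_frame = gaze.groupby("world_index")[["norm_pos_x", "norm_pos_y"]]
print(per_frame.size().describe())   # how many gaze samples each scene frame received
print(per_frame.mean().head())       # e.g. the average gaze position per scene frame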

user-d1dd9a 01 June, 2021, 12:42:37

Thx again.

user-d1dd9a 01 June, 2021, 12:49:10

Is there a heuristic that Pupil Invisible uses to determine which tracking rate it can currently run at? Or is that a fixed value?

marc 01 June, 2021, 12:50:41

It is always using the maximum capacity available on the phone. Background processes of the Android OS or the usage of other apps can lead to slight fluctuations.

user-d1dd9a 01 June, 2021, 12:56:18

OK, I ask because I read that with just the Companion app, without Cloud processing, it only works at 55 Hz. When I look at my gaze_position export I currently get 43 Hz.

user-1391e7 01 June, 2021, 13:11:25

Is this the right place to ask about pupil invisible monitor?

user-d1dd9a 01 June, 2021, 13:28:29

You should see a green OK button on the grey monitor. By clicking on that you get the video from the scene camera of the Pupil Invisible with an overlaid gaze point.

papr 01 June, 2021, 13:13:22

yes

user-1391e7 01 June, 2021, 13:13:46

very nice. it likely is a simple thing I'm doing wrong, but I'm not certain.

user-1391e7 01 June, 2021, 13:14:08

both phone and laptop are in the same local wifi

user-1391e7 01 June, 2021, 13:14:41

I start pupil invisible monitor and I can see in the shell that it is indeed finding the phone as peer

user-1391e7 01 June, 2021, 13:14:52

last message I see is it sends HELLO to peer

user-1391e7 01 June, 2021, 13:15:13

and the pupil invisible monitor window stays grey, the device to connect to does not show up

user-1391e7 01 June, 2021, 13:16:29

so my question is, what should be happening if it was working? Normally, the device to connect to would show up, right?

user-1391e7 01 June, 2021, 13:17:49

laptop is on windows, pupil monitor 1.3

user-1391e7 01 June, 2021, 13:28:59

hm, okay. then something may be blocking communication still.

marc 01 June, 2021, 13:29:44

@user-1391e7 Are you using a public wifi, e.g. university wifi?

user-1391e7 01 June, 2021, 13:29:52

nono, it's a local router set up in the lab

papr 01 June, 2021, 13:30:09

Generally, the glasses need to be connected to the phone in order for the device to be listed in Pupil Invisible Monitor.

user-1391e7 01 June, 2021, 13:30:46

they were connected yes. I'm suspecting something on the laptop's end at this point

papr 01 June, 2021, 13:31:06

Feel free to share the ~/pi_monitor_settings/pi_monitor.log file with us

user-1391e7 01 June, 2021, 13:31:20

oh there's a log! very good, let me check 🙂

user-d1dd9a 01 June, 2021, 13:32:15

does it work in a uni wifi ?

papr 01 June, 2021, 13:32:51

usually, it does not, because the network is set up to block peer-to-peer communications for security reasons

user-d1dd9a 01 June, 2021, 13:33:26

OK, because I tested it and it doesn't work.

user-1391e7 01 June, 2021, 13:33:44

that's sound, it should be blocked on an open network 🙂

user-1391e7 01 June, 2021, 13:34:10

but I'm not seeing a pi_monitor_settings folder

papr 01 June, 2021, 13:34:20

Which OS are you on?

papr 01 June, 2021, 13:34:44

On Windows check C:\Users\<user>\pi_monitor_settings

user-1391e7 01 June, 2021, 13:36:06

windows 10, okay got it ty 🙂

user-1391e7 01 June, 2021, 13:38:55

pi_monitor.log

papr 01 June, 2021, 13:40:29

Can you confirm that your phone's ip address is 192.168.1.6?

user-1391e7 01 June, 2021, 13:40:41

yes, that's the correct device

user-1391e7 01 June, 2021, 13:41:13

if the tcp connection was getting blocked, would I get to the HELLO send?

user-d1dd9a 01 June, 2021, 13:49:13

...is the Companion app still running and are you logged in?

user-1391e7 01 June, 2021, 13:41:54

I'm pretty sure I (temporarily, for testing purposes) allowed any and all connections on that local network

papr 01 June, 2021, 13:45:35

Your device (2a14de) is sending the HELLO instead of receiving it (from the phone 8811ac67). Looks like the phone is not responding.

user-1391e7 01 June, 2021, 13:49:46

yes. I could log out and in again, if that is a concern

papr 01 June, 2021, 13:52:36

Please also make sure to use the latest release of the app.

papr 01 June, 2021, 13:51:24

This is an example log when the app is running on the phone but the glasses are not connected to the phone:

2021-06-01 15:49:39 [DEBUG] Connecting to peer 4de0a4a3-8f49-f95d-f6d1-47a4b34c4c9c on endpoint tcp://192.168.1.72:49152
2021-06-01 15:49:39 [DEBUG] 1bba6a send HELLO to peer=notset sequence=1
2021-06-01 15:49:39 [DEBUG] (1bba6a) recv HELLO from peer=notset sequence=1
2021-06-01 15:49:39 [DEBUG] (1bba6a) ENTER name=OnePlus 6 endpoint=tcp://192.168.1.72:49152
2021-06-01 15:49:39 [DEBUG] (1bba6a) JOIN name=OnePlus 6 group=pupil-mobile-v4
2021-06-01 15:49:39 [DEBUG] (1bba6a) recv WHISPER from peer=OnePlus 6 sequence=2
2021-06-01 15:49:39 [DEBUG] (1bba6a) recv WHISPER from peer=OnePlus 6 sequence=3
2021-06-01 15:49:39 [DEBUG] Unsupported sensor type: event

user-1391e7 01 June, 2021, 13:53:35

I am as well! for once 🙂

user-1391e7 01 June, 2021, 13:54:15

1.2.0-prod

user-d1dd9a 01 June, 2021, 13:56:24

...make your host a hotspot and connect with the Companion device... test again.

user-d1dd9a 01 June, 2021, 13:57:23

I tried it that way on Windows 10 / 64-bit today without problems.

user-1391e7 01 June, 2021, 14:08:03

so I went ahead and disabled and reenabled all network adapters

user-1391e7 01 June, 2021, 14:09:34

extracted pupil monitor again

user-d1dd9a 01 June, 2021, 14:09:56

The video timeline in Pupil Player, or rather the graph of the recorded FPS... what can I see here, or how is that interpreted?

papr 01 June, 2021, 14:12:42

It displays f(x) = 1/(ts(x) - ts(x-1)) where x is the frame index. It is the inverse of the consecutive time difference. The unit is Hz.
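
In code, the displayed curve corresponds to something like this (the file name is an assumption):

import numpy as np

ts = np.load("world_timestamps.npy")   # scene video timestamps in seconds (assumed name)
fps = 1.0 / np.diff(ts)                # f(x) = 1 / (ts(x) - ts(x-1)), in Hz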

user-1391e7 01 June, 2021, 14:10:13

so windows firewall asks again in case it tries to connect, which it did, just like the first time

user-d1dd9a 01 June, 2021, 14:11:02

nice

user-1391e7 01 June, 2021, 14:10:20

then I hit accept and it works.

user-1391e7 01 June, 2021, 14:11:09

I also did the log out log back in thing on the phone.. idk. for now I'm glad it works, but I can't explain why it didn't or does now

user-1391e7 01 June, 2021, 14:13:26

I'll try and reproduce it tomorrow. Thank you for the assistance.

user-1391e7 01 June, 2021, 14:13:39

much appreciated

papr 01 June, 2021, 14:13:54

You are welcome.

user-d1dd9a 01 June, 2021, 14:17:08

Thx.

user-b10192 01 June, 2021, 21:57:49

We are using 2 recording methods for our research: one with our main computer and one with Pupil Invisible. We are trying to control starting and stopping the Pupil Companion app over NDSI to synchronize with our main camera recording. We are also using the 2D pupil detector from Pupil Labs, accessing the left and right video sensors to ensure our participants' eyes are in the correct position. The app sometimes crashed while recording and streaming the left and right eye videos.

user-0f7b55 02 June, 2021, 07:35:03

Do you mean you are using Pupil Capture with the Pupil Invisible Companion app?

marc 02 June, 2021, 07:58:05

@user-b10192 Do I understand your approach correctly that you are trying to stream the eye videos to Pupil Capture in order to do Pupil Detection there, which you want to use to assess if subjects are wearing the Pupil Invisible glasses correctly? A quick comment if that is the case: Pupil detection does generally work poorly on Pupil Invisible and a bad pupil detection signal does not necessarily correlate with a bad gaze signal. The "optimal" way of wearing the glasses is usually just putting them on intuitively and manual optimization may even harm the results.

Nonetheless, streaming the eye videos to Capture should work. However, actions like changing the resolution of the eye cameras may not work. Could you share your pupil_capture_settings/capture.log file with us after reproducing the error once more so we can have a look at what went wrong?

user-b10192 02 June, 2021, 22:52:16

We are using the 2D pupil detector from Pupil Labs because we got low pupil confidence in recordings even when we think the glasses are in the correct position on the participant's face. That way we can measure a rough pupil confidence with the 2D pupil detector before and during the recording. I will share capture.log when we have an error and crash again. Yesterday I used both the main computer and Invisible at the same time for 3 recordings. Each takes 8 mins, but it did not crash. I turned off OTG, plugged in the glasses cable, and turned OTG back on. Previously we were unplugging and re-plugging the glasses cable without turning OTG off and on. I will do another stress test today. If it still crashes, I will share the capture.log. Thank you so much for your help ❤️

marc 03 June, 2021, 08:06:08

"we got low pupil confidence recording even when we think the glasses is in correct position" - This sounds like you are trying to use Pupil Capture to calculate the gaze signal for a Pupil Invisible device. This cannot work well! ⚠️ Pupil Invisible has to be plugged into the Companion device and the gaze signal needs to be computed by the Companion app. If you need the gaze data in real-time on the computer as well, you can stream it over wifi from the phone. Pupil detection will not work robustly even if the Pupil Invisible glasses sit perfectly on the subject because the eye camera angles are too extreme. The pupil is not even visible in the images all the time!

user-b10192 02 June, 2021, 23:44:05

I am looking for the "pupil_capture_settings/capture.log". Where can I find it? Please guide me.

papr 03 June, 2021, 07:25:21

You can find the folder in your home directory. On Windows, you can find it under C:\Users\<user>\.

user-f2d032 03 June, 2021, 14:54:34

Hi there, we are considering buying the ‘Invisible’ eye tracker for a pretty ambitious experiment. In order to decide whether ‘Invisible’ is the right system for our purpose, we are looking for users who would be so kind as to share their experience with us – preferentially in a brief (!) online meeting. We would appreciate it if you would contact me via email [email removed]. Best wishes, Tobias

user-b14f98 03 June, 2021, 17:02:53

Tobias, perhaps it would help if you explained your use case a bit more?

user-f2d032 07 June, 2021, 09:25:02

Good point, we plan to record for 90-120 minutes continuously, mostly indoors but also outdoors. One question to experienced users would be how often they need to re-calibrate. The more important questions concern the analysis of such a large data set, e.g. ways to reduce the data, for instance by eliminating times where no fixation was identified etc.

user-94f03a 04 June, 2021, 05:30:46

Hi Tobias happy to discuss

user-f2d032 07 June, 2021, 09:28:44

Great, would you be okay with a brief Zoom (or whatever) meeting? We are mostly interested in how to deal with large data sets (90-120 minutes of recording), i.e. which features of the software might help, how easy it is to reduce the data yourself, e.g. by eliminating no-fixation periods etc., but also how stable the calibration is.

user-98789c 04 June, 2021, 11:08:39

It's not possible to change the resolution of Pupil Invisible using Pupil Capture, or at all, right?

papr 04 June, 2021, 11:43:46

Correct, it is not possible.

user-b10192 06 June, 2021, 22:38:50

We are using the Pupil Companion app on a phone, not Windows. How can I find it on the phone? Thanks

marc 07 June, 2021, 08:14:03

The Pupil Invisible Companion app does not have this log file. You said you were doing 2D pupil detection, so we figured you are using Pupil Capture or Player. How are you doing pupil detection?

user-98789c 07 June, 2021, 08:12:16

where can I find these setup instructions?

Chat image

marc 07 June, 2021, 08:12:46

Please see here https://docs.pupil-labs.com/cloud/enrichments/#setup-2

user-98789c 07 June, 2021, 08:49:28

what are the time sync settings to be able to record with Invisible using Pupil Capture?

Chat image

papr 07 June, 2021, 09:03:04

Pupil Capture is not suitable for Pupil Invisible recordings. Please use the Companion app instead.

user-b14f98 07 June, 2021, 12:01:23

I'm also happy to have a brief chat, if you would like. We have some experience with this sort of thing (I am Gabriel Diaz): https://www.nature.com/articles/s41598-020-59251-5 .

user-f2d032 07 June, 2021, 13:38:11

That would be great, your paper looks really interesting. Thanks a lot. If possible, I would like my two colleagues to be able to join the brief meeting as well. Could you therefore maybe suggest 2 or 3 time slots? We are 6 hours before your time. Thanks

user-331791 07 June, 2021, 12:36:14

Hi everyone! We require real-time gaze, fixation, and surface data; is this possible with the Pupil Invisible glasses? From the example online (Invisible/Network API) it seems the Invisible API only exposes gaze, left eye image, right eye image, world image and IMU data. Thanks!

marc 07 June, 2021, 12:42:51

Hi @user-331791! You are correct, the API of the Invisible Companion app currently only exposes gaze data, IMU data, eye videos and world video in real-time. You could however use Pupil Capture to stream the world video from the app and generate surface data in real-time there. We are currently in the process of finishing up a fixation detector for Pupil Invisible, but this will at least initially only be available post-hoc. Could you share what your application for real-time fixation data is?

user-331791 07 June, 2021, 12:50:33

Thanks Marc, that's very helpful! Streaming the data to Pupil Capture should work; can you point me to an example? It's for an experiment we are planning to do in the next few months. I'd say the surface tracking is more important than the fixations, but it would be nice to have. Do you have a rough estimate of when the fixation detector could be available?

marc 07 June, 2021, 12:59:45

@user-331791 To stream the Pupil Invisible video to Pupil Capture you simply need to go to Video Source -> Activate Device -> OnePlus 6/8 in Pupil Capture.

Please note the following constraints for the streaming:

  • Pupil Capture and Pupil Invisible Companion devices need to be connected to the same WiFi network.
  • Public WiFi networks often block the required communication protocols and typically they can not be used.
  • Some anti-virus programs may block the required communication protocols by default.
  • We recommend using a dedicated wifi router for low latency and ease of use.

Regarding the fixation detection I expect a release within the next ~6 weeks.

user-331791 07 June, 2021, 13:02:42

Ok thanks! I'd need to do some processing of the gaze and surface data. How do I get the data from Pupil Capture into a Python script, for example? Is a plugin the best option?

marc 07 June, 2021, 13:16:08

To get real-time gaze data mapped onto a surface you need to do the following:

  • Stream the video data from Invisible Companion to Pupil Capture as described above.
  • Stream the gaze signal from Invisible Companion to Pupil Capture using the Pupil Invisible Gaze Preview plugin: https://github.com/pupil-labs/pi_preview
  • Define your surfaces as usual in Pupil Capture.

Pupil Capture will then map the gaze data onto the surface and publish the results again through its own API, such that you can receive them in your script.

I need to mention that we have not tested the Pupil Invisible Gaze Preview in a while. I can't think of any changes that would affect it, but let us know if anything does not work! Also, note that it is critical to stream the gaze signal generated on the Companion app. Although you can technically stream the eye video to Pupil Capture as well and use the Pupil Core gaze pipeline on them there, the results would not be good as the algorithms are not compatible with the difficult camera angles of Pupil Invisible.
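
For reference, a minimal sketch of receiving the surface-mapped gaze in a Python script over Pupil Capture's network API (assumes Capture runs on the same machine with Pupil Remote on its default port 50020; topic and key names below are my understanding of Capture's surface tracker output):

import zmq
import msgpack

ctx = zmq.Context()
pupil_remote = ctx.socket(zmq.REQ)
pupil_remote.connect("tcp://127.0.0.1:50020")   # Pupil Remote port of Pupil Capture

pupil_remote.send_string("SUB_PORT")            # ask for the subscription port
sub_port = pupil_remote.recv_string()

sub = ctx.socket(zmq.SUB)
sub.connect(f"tcp://127.0.0.1:{sub_port}")
sub.subscribe("surfaces")                       # surface-mapped gaze topic

while True:
    topic, payload = sub.recv_multipart()
    surface = msgpack.loads(payload, raw=False)
    for gaze in surface.get("gaze_on_surfaces", []):
        print(surface["name"], gaze["norm_pos"], gaze["on_surf"])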

user-331791 07 June, 2021, 13:18:45

perfect thanks marc! I'll try it out and let you know! Thanks again!

user-b10192 07 June, 2021, 22:42:38

We are using the 2D pupil detector from this link: https://pypi.org/project/pupil-detectors/ It mentions that it is from Pupil Labs.

marc 08 June, 2021, 08:23:40

Okay, that is the same pupil detector that is also used in Pupil Capture and Player. This algorithm is not compatible with Pupil Invisible and it is expected that the returned results are of low quality as the pupil is often not really visible in the images of Pupil Invisible. Would it be possible for you to use the real-time gaze signal coming from the Companion app? Is there a specific reason why you are trying to check if the glasses sit correctly on the subject?

user-1391e7 08 June, 2021, 09:42:15

Would you say the same applies for blink detection?

marc 08 June, 2021, 09:55:17

Yes, the blink detection algorithm of Pupil Capture uses the pupil detection confidence signal to detect blinks (no confidence -> no pupil visible -> blink). For Pupil Invisible the pupil detection confidence would often be low despite the eye being wide open, and thus the detection would yield wrong results. We are currently working on a blink detection algorithm that is compatible with Pupil Invisible, but I can't give an exact ETA yet.

user-1391e7 08 June, 2021, 10:06:07

no problem at all and thank you for the information 🙂

user-1391e7 08 June, 2021, 10:08:02

got another field study tomorrow. to further protect the glasses from excessive sweat, I'm calling an 80s revival and will be putting headbands on the users

marc 08 June, 2021, 10:08:38

That sounds like a great call 😄

user-1391e7 10 June, 2021, 07:45:55

worked like a charm. although one of the participants did get it to crash by accidentally disconnecting the world cam when lifting the glasses off his head

user-1391e7 10 June, 2021, 07:46:03

the data itself was fine, the recording stayed

user-1391e7 10 June, 2021, 07:46:32

just the app did no longer respond to input until forced to close

user-1391e7 10 June, 2021, 07:48:14

just happened the one time though, just a little scary in the moment 🙂

marc 10 June, 2021, 08:17:07

@user-1391e7 Thanks for the report! Happy to hear everything went well besides that hiccup. Did this happen on the newest version of the app, 1.2.0?

user-1391e7 10 June, 2021, 08:17:28

yes, newest version, 1.2.0

marc 10 June, 2021, 08:18:17

Okay, thanks for the info. I'll forward that to our Android team to look into! 👍

user-1391e7 10 June, 2021, 08:21:12

you know how there is this little animation where you visualize the recording status? the circular border that fills and then loops around continuously? that animation tried to catch up, and at the same time, it probably wanted to display that the sensor is now missing and that's where something might go awry

user-1391e7 10 June, 2021, 08:22:59

recording time was ~10 minutes, I turned off the screen right after starting the recording, turned it back on when it was time to stop and save the recording.

marc 10 June, 2021, 08:27:25

@user-1391e7 Thanks for the info. Do I understand correctly that this happened while the recording was still running? In that case, the recording should contain a log file with information about that error called android.log.zip. Could you share this file or even the entire recording with us through [email removed] so we can try and take a look at what went wrong?

user-1391e7 10 June, 2021, 08:34:31

the recording itself I can't share due to the GDPR and other restrictions, but the log file I can try and find this afternoon and send it

marc 10 June, 2021, 09:45:46

@user-1391e7 Great, thanks! Could you add the info.json file as well? Maybe slightly censored if any of the contained information is sensitive. That would help interpreting the logs.

user-9d72d0 10 June, 2021, 12:54:17

Hello all! I have a question regarding time synchronization using NTP servers. I found the documentation here: https://docs.pupil-labs.com/developer/invisible/#time-synchronization 1) Is there any preferred app to use as NTP client? 2) As far as I understand, the timestamps will be stored relative to the sensor start. Is the absolute system time of the sensor start saved somewhere in order to make use of the synchronized system time via NTP?

marc 10 June, 2021, 13:01:25

Hi @user-9d72d0! All timestamps are saved as absolute UTC timestamps rather than measuring time from sensor start. The time sync synchronizes the phone's clock to UTC and is done by Android itself, so no extra app is needed. You can force a time sync in Android like this:

- Go to the date and time settings: Settings (Android 9: System) > Date & Time
- Disable Automatic date & time
- Set the time to a random time (might not be required, but it visually confirms the sync)
- Re-enable Automatic date & time while the phone is connected to the internet (the phone will show the NTP-synced time)

On other devices, e.g. a laptop, the sync would also be done through the OS settings.

user-9d72d0 10 June, 2021, 13:02:13

Thank you very much for the quick and detailed answer! I will check this.

user-df9629 10 June, 2021, 17:18:24

Hi, I have a few questions: 1] Is there a known temporal offset in retrieving head pose using AprilTags? (I have started noticing an inconsistent temporal offset.) 2] How are the timestamps for calculating head pose using AprilTags being handled? Thanks!

papr 10 June, 2021, 17:44:56

Could you give an example that visualizes the issue? I am not sure what you mean. Apriltag detections inherit the timestamps from the scene video frame in which they were detected.

user-df9629 11 June, 2021, 13:58:02

Sure. Here, I am plotting the raw data I get from the IMU along with the head poses that are exported by Pupil Player. I tried to match the time I get for head pose to the scene camera timestamps. The head motions observed in the IMU data show up later in the AprilTag position and rotation. Any insights?

Chat image

user-d8879c 12 June, 2021, 17:15:17

I am unable to download the recording... I have tried 3 different internet connections but it still won't work (I updated the phone and app). Please help.

nmt 13 June, 2021, 07:34:20

Hi @user-d8879c. Please try logging out of the App and logging back in.

user-d8879c 12 June, 2021, 17:54:08

The Android file transfer also does not work.

marc 14 June, 2021, 07:59:35

@user-d8879c Are you trying to upload recordings to Pupil Cloud or download them from the phone via USB?

nmt 13 June, 2021, 07:34:45

You can read further instructions for local transfer here: https://docs.pupil-labs.com/invisible/user-guide/invisible-companion-app/#local-transfer

user-dd0253 13 June, 2021, 09:39:12

Hello, I am a new Pupil Invisible user. I recorded my first video, uploaded it to the Cloud, and created a new enrichment project, but I don't see any gaze or eye tracking after post-processing. Why? What went wrong? Thanks

wrp 14 June, 2021, 03:39:30

Hi @user-dd0253 did you see gaze data in the Companion App preview? Did you see eye icon/track in the home screen of Pupil Invisible Companion?

user-3b040b 15 June, 2021, 09:12:24

Hello! I have a question about the Invisible lenses. Is it possible to remove the lenses completely while using the glasses? Or are they necessary for functioning?

wrp 15 June, 2021, 09:14:00

You can remove the lenses completely. The lenses do add some rigidity to the frame; however, functionality should be just the same. Out of curiosity, what's your application?

user-3b040b 15 June, 2021, 09:15:55

Great, thank you. I am doing color research in lighting and all filtering in front of the visual system of a subject would have to be characterised or justified. So it would be easier to remove them if possible.

user-b14f98 17 June, 2021, 16:09:54

I'm following this thread closely (JoBot is RIT!)

papr 17 June, 2021, 18:05:13

Yeah, sorry, I was not able to look at this in detail 😕

user-b14f98 17 June, 2021, 18:23:29

You used past tense there - do you plan on looking at it in more detail, or would you prefer that we run some tests? We can measure the latency using cross-correlation between the AprilTag data and the IMU data.

papr 17 June, 2021, 18:30:38

I was not able to understand the graph yet. I would like to spend some time understanding it. Maybe there is an easy explanation for what you are seeing. Please also have a look at my DM

user-b14f98 17 June, 2021, 18:24:14

Our current understanding is that they are desynchronized. There is a multi-second latency.

user-b14f98 17 June, 2021, 18:31:08

Ok! If you would like to meet with Jeff Pelz and Anjali and I on zoom, we can schedule a quick meeting in the near future.

user-b14f98 17 June, 2021, 18:31:44

also, in a zoom now so I will be delayed in my response (gotta refocus, back in a few!)

user-df9629 17 June, 2021, 18:32:11

BTW, I am Anjali 🙂

papr 17 June, 2021, 18:32:35

Hi Anjali 👋 nice to meet you

user-df9629 17 June, 2021, 18:33:28

Hello @papr , nice to meet you too! 👋

papr 17 June, 2021, 18:59:08

Could you let me know what timestamps you used for the graphs above (original VS player) and how you extracted the Apriltags location/timing?

papr 17 June, 2021, 19:05:01

Or did you use the Cloud video / raw data exporter?

user-df9629 17 June, 2021, 19:28:19

I first downloaded the recording from Cloud, loaded it in Pupil Player, used the Head Pose Tracker plugin, and then used the raw data exporter. In the image above, the x-axis is raw IMU timestamps. For the AprilTags, I converted the timestamps to nanoseconds and then to uint64. I then added the first scene camera / world timestamp to the modified AprilTag times (to bring them on par with the raw IMU timestamps).

papr 17 June, 2021, 19:31:25

"I then added the first scene camera / world timestamp to the modified april tags time" - That transformation is inaccurate. Player subtracts the start_time from info.json before converting nanoseconds to seconds, not the first world timestamp.

papr 17 June, 2021, 19:32:08

And since there can be a startup time between the user hitting the record button (ts stored as start_time) and the scene camera actually starting to record, this would already explain your delay.

user-df9629 17 June, 2021, 19:35:27

So I should be adding the start time from info.json, correct?

papr 17 June, 2021, 19:35:35

correct

user-df9629 17 June, 2021, 19:35:45

On it. I will try it right away

papr 17 June, 2021, 19:36:11

I am reproducing your graph with a short example as we speak, too

user-df9629 17 June, 2021, 19:51:53

Surprisingly, the scene camera's first timestamp is bigger (or earlier) than the start_time from info.invisible.json 😐 The current plot is very similar to the one I already had. I will run it on other recordings and see if it is consistent.

papr 17 June, 2021, 19:52:29

if the timestamp is bigger, it means it happened after start_time

user-df9629 17 June, 2021, 19:53:42

Oh yes. Earlier the time, smaller the value. My bad.

user-df9629 17 June, 2021, 20:07:37

I still see some lag

Chat image

papr 17 June, 2021, 20:08:55

For my experiment, I am using the marker detections that can be exported via Player v3.3. That makes the movement a bit more clear

user-df9629 17 June, 2021, 20:07:49

(Zoomed out)

Chat image

papr 17 June, 2021, 20:09:06

I am nearly done.

user-df9629 17 June, 2021, 20:10:58

My Player version is 3.2.20 I await your results!

papr 17 June, 2021, 20:17:33

/cc @user-b14f98

papr 17 June, 2021, 20:19:50

The sync looks pretty decent to me.

papr 17 June, 2021, 20:23:38

Cleaner upload

Chat image

papr 17 June, 2021, 20:36:50

For reference, here is the source code and the exported csv data https://gist.github.com/papr/b6a13673371b3c4941aaada20eaf0547

user-df9629 24 June, 2021, 20:04:43

Thank you so much. I applied the same processing to my data and it looked good until I zoomed in on it.

Chat image

user-b14f98 17 June, 2021, 22:40:13

Thanks a lot, @papr! We'll have a look and get back to you.

user-16e6e3 18 June, 2021, 12:30:07

Hi,

I have a question regarding IMU data as well. From my understanding, to be able to interpret relative head pose data, you need to know where the head is in relation to the room at timepoint zero (so at the very beginning of the measurement, after we hit the record button). With our current setup, we don't know where the head was at time zero, but we do know where it was let's say at timepoint 5. Is it possible with the IMU data, to consider the head position at timepoint 5 as "the original position" and look at the data that comes after it as relative to this position? Or does it necessarily have to be relative to the very first head position at timepoint 0? Thanks!

nmt 18 June, 2021, 13:39:33

Hi @user-16e6e3. When you refer to relative head pose, do you mean relative position of the IMU? It's just that the IMU records angular velocity and linear accelerations, not position.

papr 18 June, 2021, 13:27:27

Since it is a relative coordinate system, you should be able to use any timepoint as a reference.

user-16e6e3 18 June, 2021, 13:47:32

@nmt I guess I meant the distance rather than position. We multiplied the time interval (5 ms, since it's 200 Hz) by the angular velocity for each measurement point to get the angle in degrees (i.e. an approximation of the distance traveled by the head in 5ms).

nmt 18 June, 2021, 13:50:03

Ah okay. About which axis are you applying this calculation?

user-16e6e3 18 June, 2021, 13:55:22

Around the y-axis. We are interested in horizontal head movements and would like to know where the head was at any given timepoint during the experiment.

nmt 18 June, 2021, 14:00:29

I think you may already be aware of this, but just to reiterate, if you numerically integrate angular rate, you will see drift that gets bigger with time. This is due to the finite sampling frequency associated with any real-world gyro. I would also consider using the Head Pose tracker for more accurate rotations about the yaw axis: https://docs.pupil-labs.com/core/software/pupil-player/#head-pose-tracking
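
For illustration, the naive integration described above looks roughly like this (a sketch only; drift from gyro bias and the finite sampling rate accumulates over time):

import numpy as np

def integrate_yaw(gyro_y_deg_s, fs=200.0):
    """Naively integrate yaw rate (deg/s, sampled at fs Hz) into a yaw angle trace (deg)."""
    return np.cumsum(np.asarray(gyro_y_deg_s) / fs)

# Toy check: a constant 10 deg/s for one second integrates to ~10 deg
print(integrate_yaw(np.full(200, 10.0))[-1])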

user-16e6e3 18 June, 2021, 16:14:25

Thanks! Yes, we are aware of the drift error. Unfortunately, using the Head Pose tracker does not seem to work for our setup. In the experiment, subjects are seated in front of a wall onto which images are projected and are told to freely explore the images using head and eye movements. When we looked at the data available with the Head Pose tracker, there wasn't much variability on the x axis, although subjects definitely moved their heads left and right quite a lot. So we thought that because they are not really moving in a 3D environment, IMU data might be more suitable?

nmt 18 June, 2021, 17:56:54

How did you calculate degrees of rotation about the yaw axis from the output of the head pose tracker (i.e. rodrigues rotation vector)?

user-16e6e3 18 June, 2021, 19:34:31

We just looked at how rotation_x values from the .csv output varied over time without calculating anything specific.

user-b10192 21 June, 2021, 02:25:33

Hello there! 1- We are still using Pupil Player version 3.1 to develop/code our own research. Could you please guide me on how we can calculate pupil_time and world_index from recorded videos? Is there any open-source code in the Pupil Labs repositories which is relevant to this? 2- We have already exported the sensor timestamps, but they do not match the pupil_time from pupil_positions.csv. How could we match or convert the sensor timestamps to pupil_time (or maybe world_index)? 3- I have installed the latest version of Pupil Player, version 3.3.0. I am wondering how I can access the blink detector and pupil position data?

marc 21 June, 2021, 07:38:10

I am still not sure I understand your setup correctly. You are using Pupil Invisible, right? Pupil data is not available for Pupil Invisible, and using Pupil Player to detect pupils will yield low-quality results. Accordingly, pupil position data cannot be used for Pupil Invisible recordings in Pupil Player. The blink detection algorithm is internally based on pupil detection as well and can thus also not be used with Pupil Invisible.

Regarding the confusion with timestamps, which timestamps do you consider the "sensor timestamps"? The gaze data (and, if you force it to be generated, also the pupil data) will carry the timestamps of the eye camera frames they were computed with. The world camera frames carry independent timestamps. Also, note that Pupil Player converts timestamps on importing recordings. The timestamps in raw Pupil Invisible recordings are nanosecond UTC timestamps; on import they become seconds since world video start.

user-359878 30 June, 2021, 12:58:18

Hi @marc, I have a related timestamps question, using the Pupil Invisible tracker. If the eye camera frames' timestamps carry over to the gaze data, and the eye camera runs at 200Hz whereas you get 66Hz from the scene/world camera, that seems to imply that one of three eye camera frames was used to compute gaze? Is that correct? If so, is which eye camera frame gets used a consistent thing (e.g. always the middle frame), or is it somehow averaged across the three eye camera frames (seems unlikely but... possible!), or does it depend on eye camera image/frame quality (selecting which of the three frames is optimal and taking that timestamp)? For our purposes, we need to handle temporal precision carefully to get valid eye dynamics measures out - so if a gaze sample has an implied temporal imprecision of a 3-sample-wide window (15ms) on those raw gaze positions, we need to know that and take account of it in event detection. Thanks!

user-16e6e3 21 June, 2021, 08:48:16

@nmt to continue our conversation from Friday, do you have any feedback on that? 🙂 I just realized it was rotation_y values, since we are interested in horizontal head movements (left-right). But those did not vary a lot either.

nmt 21 June, 2021, 09:08:49

Hi @user-16e6e3! If you want to examine whether the glasses wearer rotated their head from left to right, just looking at the y component of the rotation vector provided by the head pose tracker won't be sufficient. In simple terms, the rotation output of the head pose tracker is a compact representation in Rodrigues notation, which describes a rotation angle about a given vector. This is different from 3D angles expressed as three consecutive rotations about the x, y, z axes (e.g. given between 0-360 degrees). The latter is certainly more intuitive, and I would therefore recommend that you convert the output of the tracker to radians or degrees. This is reasonably straightforward. Are you able to implement some basic Python scripts?

user-16e6e3 21 June, 2021, 09:47:39

@nmt Thank you! Okay, I see. We'll convert the output to radians or Euler angles then, and see if the data makes more sense. We use R and Matlab, but can also try Python if you would recommend using it instead or have a script for the conversion already.

nmt 21 June, 2021, 12:53:18

You can do this using OpenCV Python bindings. The first step is to convert each rotation vector to a rotation matrix using the OpenCV Rodrigues function. The second step is to convert each rotation matrix to radians (and degrees if required).

This example shows how to convert a rotation matrix to radians, which essentially replicates the Matlab rotm2euler function, but with the order of x and z swapped: https://learnopencv.com/rotation-matrix-to-euler-angles/

Here is an example implementation in Python:

import cv2
import numpy as np

# example rotation vector for input = [rotation_x, rotation_y, rotation_z]
rot_vec = [1.99795496463776, 2.10739278793335, 0.527723252773285]

def rod_to_euler(rot_vec):
    # convert rotation vector to rotation matrix
    rot_vec = np.asarray(rot_vec, dtype=np.float64)
    rot_mat = cv2.Rodrigues(rot_vec)[0]

    # convert rotation matrix to radians
    sin_y = np.sqrt(rot_mat[0, 0] * rot_mat[0, 0] +  rot_mat[1, 0] * rot_mat[1, 0])
    singular = sin_y < 1e-6

    if not singular:
        x = np.arctan2(rot_mat[2,1] , rot_mat[2,2])
        y = np.arctan2(-rot_mat[2,0], sin_y)
        z = np.arctan2(rot_mat[1,0], rot_mat[0,0])
    else:
        x = np.arctan2(-rot_mat[1,2], rot_mat[1,1])
        y = np.arctan2(-rot_mat[2,0], sin_y)
        z = 0

    return np.rad2deg(np.array([y, x, z]))

user-16e6e3 21 June, 2021, 15:44:06

Okay, we'll try 😅 thank you!

nmt 21 June, 2021, 16:10:18

Let us know how you get on. I'd definitely recommend this method over integration of gyro data without drift correction!

user-df9629 22 June, 2021, 14:59:10

Hello @papr , Pupil Player's output (version 3.3.0) exports the marker_detections in .pldata format instead of a .csv. Any library I can use to read the file? Thank you!

papr 22 June, 2021, 15:03:51

The marker detections are exported as csv if you run the export. The pldata file is just a cache file. It contains the same data though

user-df9629 22 June, 2021, 15:02:09

No worries. I found the function in shared_modules.

user-df9629 22 June, 2021, 15:05:06

The Raw Data Exporter, right? I did use it to export 😐

papr 22 June, 2021, 15:05:20

No, it is part of the surface tracker

user-df9629 22 June, 2021, 15:05:56

Oh okay. I am not using that. I will try it

user-3b5a61 23 June, 2021, 00:48:00

Hi guys, I have a few questions regarding the glasses. Today I have a Pupil Core and everything works great using Pupil Capture, etc. However, I'm considering upgrading to Pupil Invisible. My questions are: Using Pupil Invisible, can I (1) use Pupil Player to analyse data? (2) use Pupil Cloud to provide raw data just like Pupil Player (fixations, blinks, etc.)?

marc 23 June, 2021, 19:47:09

Hi @user-3b5a61! Yes, you can analyse Pupil Invisible recordings in Pupil Player, however not all functionality is available. Since the gaze estimation pipeline of Pupil Invisible is not based on pupil detection, functionality based on it is missing. This includes pupillometry data and blink detection. At the same time, some additional functionality is only available for Pupil Invisible within Pupil Cloud, like the Reference Image Mapper. Fixation detection is not yet available, but will be within a few weeks. We are working on blink detection for Pupil Invisible as well and I expect that to become available in Q3. All the available data can be accessed raw in convenient formats exported from Player or Cloud.

user-3b5a61 23 June, 2021, 21:48:57

Excellent, @marc! Thank you!

user-df9629 24 June, 2021, 20:06:07

The peaks are off by about 0.5 seconds. Any insights on why this could be happening?

Chat image

papr 24 June, 2021, 20:17:03

Looking at the plugin code, it looks like it discards timestamps from the beginning if there are more timestamps than values. https://github.com/pupil-labs/pupil/pull/2151/files#diff-c842b9c578b951d78da479174dd5d0cf32fa40171e804d5f9c72e023cba499afR245-R251 I am not 100% sure if this is correct. It would at least explain why there is an offset. If you change line 251 in your user plugin like this

- self.ts = self.ts[num_ts_during_init:]
+ self.ts = self.ts[:-num_ts_during_init]

does the offset still exist?

papr 25 June, 2021, 09:20:13

Also, could you share the used IMU data and timestamp files with [email removed] I would like to check if the data has more timestamps than IMU values.

user-df9629 25 June, 2021, 14:48:12

Sure. Sending it right away!

user-b14f98 27 June, 2021, 15:58:25

Hey folks - getting recording errors on my One+8 when reconnecting the camera after it has been disconnected. Any settings I should double check?

papr 27 June, 2021, 16:01:10

Usually, the error message should let you know what to do. If it does not, or you are unsure on how to proceed, please send an email with details (what the error says, etc) and the android.log.zip file (not 100% about the naming here) to info@pupil-labs.com

papr 30 June, 2021, 13:01:18

Hi, the gaze estimation runs as quickly as possible and always uses the most recent eye video frames for inference. I suggest using the densified 200 Hz gaze signal from Cloud if you need a more consistent gaze estimation frame rate. If you need a fixed sampling rate, one option would be to fit a linear interpolation function and sample it at evenly spaced timepoints.
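
As a sketch of that resampling idea (the path and column names assume a Pupil Player gaze_positions.csv export; adjust as needed):

import numpy as np
import pandas as pd

gaze = pd.read_csv("exports/000/gaze_positions.csv")   # path is an assumption
ts = gaze["gaze_timestamp"].to_numpy()

# Evenly spaced 200 Hz grid spanning the recording, then linear interpolation
even_ts = np.arange(ts[0], ts[-1], 1 / 200)
x_even = np.interp(even_ts, ts, gaze["norm_pos_x"].to_numpy())
y_even = np.interp(even_ts, ts, gaze["norm_pos_y"].to_numpy())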

user-bbf437 06 July, 2021, 10:18:35

Hi @papr, you mentioned Cloud offers a densified 200 Hz gaze signal, but we cannot upload data to the Cloud due to research protocols, as @user-359878 explained. Would it be an impossible thing to ask if we may use the code for densifying the gaze data on my local laptop, if it is available open source somewhere? May I ask how it is done - densifying the gaze data to 200 Hz? Would it be wrong to guess you run the deep learning inference using all eye camera frames, without the constraint of mobile resources?

user-359878 30 June, 2021, 13:13:39

Hi again @marc, @papr. We can't use Cloud as we have to comply with data handling protocols for secure academic research, unfortunately. However, we definitely don't want to assume regular sampling when it's not regular, as it would destroy all eye dynamics and velocity measures. What we need to know is what the temporal imprecision actually is, rather than interpolate through loss.

We need the gaze coordinates to match whatever the exact time the eye camera frame used as the basis for gaze estimation was sampled, in other words. It seems from your response above that it could be any one of three(?) eye camera frames giving the timestamp to the gaze data, or?

Loss is not a big issue for us - imprecision and inaccuracy in the timestamps is a bigger issue, also for valid event detection. For this reason, linear interpolation is also not the right approach for us. If you interpolate linearly through loss of spatial coordinates, you're always going to get a flat velocity profile, which is not possible for the eye and will also mess up event detection, potentially. We're OK with loss so long as time and gaze position match the reality of the eye behaviour, so we just need to clarify those uncertainties so we can implement the best data handling for our purposes. Thanks!

papr 30 June, 2021, 13:59:45

Hi, thanks for providing more context. The realtime inferred gaze datum inherits the timestamp of the left eye video frame that was used for inference. The eye timestamp is measured at reception time of the frame data on the android device. There is ~30ms delay between frame exposure and reception which is not corrected for.

user-359878 30 June, 2021, 14:15:18

A PS - it's not so important for us that sampling is absolutely regular - we can handle irregular sampling so long as the time and gaze coordinates are correctly temporally coupled - i.e. so long as the irregular sampling is correctly timestamped and any lag or delay measured.

user-359878 30 June, 2021, 14:12:53

OK thanks for clarifying - so it is 1. one single eye camera frame that is used to estimate gaze, and that eye camera frame's timestamp for left eye is used as the timestamp inherited by both left and right eye gaze coordinates? Is that correct? and 2. there is a non-stable ~30ms delay between sampling and writing the time stamp, and that delay is included as inaccuracy in the gaze timestamps? I guess that leads to 3. how stable is the 30ms delay....if it's software rather than hardware driven temporal imprecision I guess this could vary a lot? Thanks!

papr 30 June, 2021, 14:16:01

  1. The latest frames from both eye cameras are used to infer one gaze location, but only the timestamp of the left eye frame is inherited by the gaze datum.
  2./3. I cannot quantify the variance of the delay at the moment.

user-359878 30 June, 2021, 14:24:52

OK. I think I can reasonably estimate the delay based on the calculus of rotations, but it's not ideal - bit reverse engineered. Those delays would definitely cause velocity profiles to veer significantly from realistic eye movement, though. 30ms is a very significant delay, given most saccades would happen within that timeframe, and effect sizes of e.g. fixation duration differences are not huge. It would be really useful to be able to quantify lag somehow, perhaps as the ioHub does in psychopy for integrated trackers? Is there any group working on this delay issue that we could join/contribute to?

papr 30 June, 2021, 14:27:42

The lag itself should not be an issue for you as long as it is consistent. The variance should be much smaller than 30ms!

user-359878 30 June, 2021, 14:28:05

Yes that consistency was part of the question - and is what I'm calling 'temporal precision', whereas the 30ms delay I would term 'temporal accuracy', if that makes sense - is that 30ms consistent, or is that 'unknown' right now? There are many possibilities each of which imply a different optimal processing and event detection. For example, if that 30ms delay is because it's 3 eye camera frames, perhaps alternating image processing between left and right eye, and one of those frames from left eye contributes timestamps to gaze data, this implies a very different optimal approach to event detection than if it's an averaged gaze position from image processing of all three images with an averaged timestamp, or one timestamp....lots of ms in the mix, all of which could be handled but not if we don't know the ground truth...

papr 30 June, 2021, 14:39:56

"30ms delay is because it's 3 eye camera frames" - The 30 ms are per video frame. It is totally independent of the frame/processing rate.

"alternating image processing between left and right eye" - There is no alternating image processing. The network always processes two eye images.

"calculus of rotations" - Please keep in mind that the gaze direction is a cyclopean ray estimated in scene camera coordinates using the scene camera as its origin. PI does not estimate gaze directions with the eyes as origin.

user-359878 30 June, 2021, 14:42:07

OK, so it's 3 eye camera frames, as I thought. Which frame contributes its timestamp to the gaze data? Or is the timestamp averaged across those 3 eye camera frames? Thanks.

papr 30 June, 2021, 14:45:37

I do not understand how you inferred this from my message. The gaze estimation pipeline skips frames if it is not able to estimate gaze quickly enough.

For clarification: this was a citation from your previous message

30ms delay is because it's 3 eye camera frames

user-359878 30 June, 2021, 14:59:53

OK, then I also didn't understand your answer 🙂 Will try to clarify, thanks for your responses.

Which eye camera frame passes on its timestamp to the gaze data? When you say 'most recent', what does that mean? Does it mean that if there is gaze data in the last eye camera sample, that's the timestamp inherited by the gaze data, but if e.g. two eye camera frames do not produce a gaze coordinate, only the timestamps of the valid samples are inherited? The eye camera is sampling at 200Hz, so there are clearly several samples per world-camera-based gaze position datum.

To put it as clearly as possible: if the timestamp could come from any one of a number of consecutive left eye camera samples, then this greatly increases the temporal imprecision beyond what is required by processing alone. If we know it consistently takes the timestamp from the particular eye camera frame used to calculate gaze, and neither gaze position nor time is averaged across eye camera frames, then we know that the timestamp and gaze coordinates are temporally coupled even if the sample rate is variable. If there is averaging of gaze position over several frames and this averaging is not also applied to the timestamps, this is a source of imprecision too - and not a small one - up to 20ms.

So, just to clarify how the timestamps relate to the gaze position data, and whether any offset is consistent: it seems there are several potential sources of temporal imprecision if I've understood you correctly. Loss of frames is not a problem for us if the timestamp of the frame that is reported is correct, or only affected by a consistent delay of 30ms that we can translate for.

marc 30 June, 2021, 15:06:46

@user-359878 A few comments also from my side:

- Would your policies allow you to temporarily only upload the eye video data (so no scene video) to calculate the 200 Hz signal?
- Please note that the gaze estimates are made independently frame-by-frame, but on top of that there is a 1Euro filter that slightly smooths the signal. Given your temporal precision requirements this might make a difference for you.

user-359878 30 June, 2021, 15:08:09

OK - then that smoothing is also an 'unknown' from my side - but first, it would be good to clarify the points above...which would be a bigger problem than some light smoothing on spatial coordinates, potentially. It's very important to know which of several eye camera frames contributing to the gaze coordinates contributes the timestamps? Imagine you're in a saccade. You're moving fast for about 40ms, with top speed around the middle of the time. Averaging those gaze coordinates across several eye camera frames will give a gaze coordinate somewhere around the middle of the movement. If the timestamp inherited is also averaged, this is OK - it's consistent with the actual behaviour. If, on the other hand, the timestamps are inherited from the most recent eye camera frame, then the gaze coordinates and timestamps will not be consistent with the actual behaviour. Does this make sense? It's quite possible the answer is simply - we always report gaze coordinates from a single frame and that frame is stamped with the eye camera time. But, if gaze is averaged across eye camera frames, we need to average the time stamps too or they will be out of sync.

papr 30 June, 2021, 15:16:05

ok, let me try to summarize it. There are basically 4 processes going on in parallel: 1. world video reception and recording (~30hz) 2. left eye video reception and recording (~200hz) 3. right eye video reception and recording (~200hz) 4. gaze inference (~66hz)

Processes 1-3 run independently of each other, but generate synchronized timestamps as they all use the same clock.

P4 asks P2 and P3 for their last processed frame, passes the 2 images to the gaze estimation network, resulting in one gaze datum which inherits the timestamp from the left eye frame. When it is done, it goes back to P2 and P3 for their newest frames.

Note, as you can see, gaze estimation is 100% independent of P1.

The 1eur filter only smoothes spatially, not temporally.

No timestamps are averaged.

Regarding the 30 ms delay: This affects P1-P3. The camera starts exposing the image at T0 and sends the image via USB to the phone. The software receives the frame data at T1. T1 is used to timestamp the frame. The T1-T0 difference is the 30ms delay.

P1-P3 result in video and time files with the full sampling rate. P4 results in gaze position and time files, whose timestamps are a subset of the timestamps recorded in P2.
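
Conceptually, P4 behaves roughly like the following sketch (all names are hypothetical; this is an illustration of the description above, not the Companion app's actual code):

def gaze_inference_loop(left_cam, right_cam, network, publish):
    """Hypothetical sketch of process P4 as described above."""
    while True:
        left = left_cam.latest_frame()     # frames that arrived while inference was
        right = right_cam.latest_frame()   # busy are simply skipped (never averaged)
        gaze = network.infer(left.image, right.image)
        gaze.timestamp = left.timestamp    # gaze inherits the LEFT eye frame's timestamp
        publish(gaze)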

user-359878 30 June, 2021, 15:25:46

Thanks! When you say: "P4 results in gaze position and time files, whose timestamps are a subset of the timestamps recorded in P2." Right. Which of the potential timestamps recorded in P2 are inherited as gaze timestamps? Is this consistent, or variable? If you have three valid eye camera frames and resulting gaze coordinates since the last gaze datum was written, do you just take the most recent one available, or all three to compute gaze position? If you just take the most recent one and essentially dump the previous two, does the gaze data also get the timestamp from the most recent eye camera frame used to calculate gaze? Apologies if this seems like asking the same question over and over - I am! But maybe not clearly enough - hope this helps.

papr 30 June, 2021, 15:31:24

"If you just take the most recent one and essentially dump the previous two" - Correct, this is what I meant by skipping frames, yes!

"does the gaze data also get the timestamp from the most recent eye camera frame used to calculate gaze?" - Correct!

"Apologies if this seems like asking the same question over and over" - No worries at all! Happy to help build the correct model of how the pipeline works.

user-359878 30 June, 2021, 15:38:15

OK! Phew - so - to conclude - the gaze data is from a single eye camera frame, with the eye camera timestamp attached. No averaging of spatial or temporal data occurs - though a light smoothing on spatial samples is done at some point after the data is called. The sampling may be temporally inconsistent, but that inconsistency will always be faithfully represented in the timestamps of gaze data. If all that is true, I've the info I need, thanks a lot! BUT - what smoothing is done and can I turn it off if I want to? 🙂 Is it this one? https://hal.inria.fr/hal-00670496/document

papr 30 June, 2021, 15:58:46

This is the correct paper. Distances in meters/cm etc. are kind of difficult to compare to eye movements. A more meaningful unit would be degrees within the scene camera's field of view.

papr 30 June, 2021, 15:43:03

"the gaze data is from a single eye camera frame" - Not 100% correct. It is based on two frames, one from each eye camera, but with the timestamp of the left camera frame attached.

The spatial 1eur filter is already applied in the recorded gaze data and cannot be turned off AFAIK

user-359878 30 June, 2021, 15:56:12

Hrm. So, based on this chart from the 1euro filter article linked above, which looks at spatial inaccuracy of mouse position according to different 1euro filter adaptive thresholds, we can see that high speeds tend to lead to about 2cm offset spatially. Given the eye moves a lot quicker than the hand, this will be larger in gaze data. Is my thinking correct here? Would be good to be able to turn off filtering so we can test the experiment. We need to ascertain which objects in the real world are looked at and how gaze relates temporally to hand movements to pick them up.

Chat image

papr 30 June, 2021, 15:59:29

Have you read the PI whitepaper already?

user-359878 30 June, 2021, 15:59:35

Agreed, but we don't have that reported so this is analogous to distance in the scene camera image at lower (hand not eye) velocities. If I've understood the pipeline, then this 1euro filter will cause increased spatial inaccuracy when the eye is in motion (saccade or smooth pursuit) as a result of prioritising low lag, but will sacrifice temporal accuracy (allow lag) for better spatial accuracy during fixations, and how much is very testable. Does this seem correct?

papr 30 June, 2021, 16:07:59

If I understand it correctly, high-frequency spatial noise is removed when the change is below a specific threshold, so fixations look visually more stable.
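
For reference, here is a generic implementation of the 1 Euro filter as described in the linked paper. The parameter values are illustrative defaults, not the ones used on the device, and this is not PI's production code.

```python
import math

class OneEuroFilter:
    """Generic 1 Euro filter (Casiez et al.): adaptive low-pass filtering
    that smooths strongly at low speeds and reduces lag at high speeds."""

    def __init__(self, min_cutoff=1.0, beta=0.0, d_cutoff=1.0):
        self.min_cutoff = min_cutoff  # Hz; smoothing strength at low speeds
        self.beta = beta              # speed coefficient; higher -> less lag
        self.d_cutoff = d_cutoff      # Hz; cutoff for the derivative filter
        self.x_prev = None            # previous filtered value
        self.dx_prev = 0.0            # previous filtered derivative
        self.t_prev = None

    @staticmethod
    def _alpha(cutoff, dt):
        tau = 1.0 / (2.0 * math.pi * cutoff)
        return 1.0 / (1.0 + tau / dt)

    def __call__(self, t, x):
        if self.x_prev is None:
            self.x_prev, self.t_prev = x, t
            return x
        dt = t - self.t_prev
        # Estimate and low-pass filter the signal's speed.
        dx = (x - self.x_prev) / dt
        a_d = self._alpha(self.d_cutoff, dt)
        dx_hat = a_d * dx + (1.0 - a_d) * self.dx_prev
        # Adapt the cutoff: fast movement -> higher cutoff -> less smoothing/lag.
        cutoff = self.min_cutoff + self.beta * abs(dx_hat)
        a = self._alpha(cutoff, dt)
        x_hat = a * x + (1.0 - a) * self.x_prev
        self.x_prev, self.dx_prev, self.t_prev = x_hat, dx_hat, t
        return x_hat
```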

user-359878 30 June, 2021, 16:14:49

Well, we are also working on this in our lab, so I repeat that if you have any open source groups working on this, we'd like to join. There's not much we can do with unknowns - we can control for / adapt to whatever we can measure, but not to unknowns. For that, all we can do is identify which periods are consistent with real eye movements, and which periods are not and should probably be removed from analysis. We do this based on known bounds of eyeball rotation and dual recording with high-end trackers.

papr 30 June, 2021, 16:09:07

Unfortunately, my expertise in this regard is limited. I can only say that we are investigating this topic (in a lot of detail) as part of our effort to improve the product.

papr 30 June, 2021, 16:15:51

Then again, please keep in mind that PI's gaze direction rotates around the scene camera origin, not the eye center - i.e. "known bounds of eyeball rotation" might not apply.

user-359878 30 June, 2021, 16:17:37

They definitely apply - there are maximum velocities the eye can rotate, for example, and there can also never be a flat velocity profile - if there is, then you know data has been linearly interpolated through loss and is therefore unreliable in terms of eye dynamics. We also can't have any instantaneous change in either acceleration or direction of movement - that's also a sign of non-human data...plus several other things that we can consider 'ground truth', but these details are beyond the original question.

papr 30 June, 2021, 16:19:27

I am just saying that an eye rotation of x degrees does not necessarily correspond to a rotation of x degrees in the estimated gaze signal

papr 30 June, 2021, 16:21:42

I agree that using physiological knowledge is extremely helpful for filtering out "non-human" data.

user-359878 30 June, 2021, 16:23:37

Yes - this is the way to handle noisy data correctly - with priors from the physiological reality. For us, it's a case of quantifying error in when a person looks at a particular object, since the interval between fixating on an object and the hand moving toward it may be very short. The 1euro filter poses data quality concerns in this regard, if I've understood you.

papr 30 June, 2021, 16:26:01

It seems like you are mostly interested in the variance of the spatial signal, not the bias. Am I correct, and if so, why? Shouldn't bias be a greater concern for you?

papr 30 June, 2021, 16:36:23

I tried to estimate the delay variance by simply looking at eye video timestamp differences. The differences vary with a std of ~0.15ms. (please ignore the funky x-axis label)

Chat image

papr 30 June, 2021, 16:46:19

Looking at the absolute time differences, you can see two peaks due to dropped frames. You get the first histogram by subtracting 4 ms from the left cluster and 8 ms from the right cluster (clusters split at a 6 ms difference).

Chat image
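
As a hedged illustration of that folding step, here is a small sketch on synthetic ISI data (not the recording analysed above); the 4 ms/8 ms periods and the 6 ms split are taken from the description.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic intersample intervals in ms: mostly one-frame gaps (~4 ms),
# plus some two-frame gaps (~8 ms) caused by a dropped frame.
isi_ms = np.concatenate([
    rng.normal(4.0, 0.15, 9000),
    rng.normal(8.0, 0.15, 1000),
])

split = 6.0  # ms, the boundary between the two clusters mentioned above
residual = np.where(isi_ms < split, isi_ms - 4.0, isi_ms - 8.0)
print(f"residual jitter std: {residual.std():.3f} ms")
```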

user-359878 30 June, 2021, 16:49:22

We are interested in both, of course! But we also have our own calibration techniques to try out; we will see if we can improve on the deep learning one or if the deep learning one outperforms other methods.

Both temporal and spatial precision are equally important for correct event detection - either being off leads to distortions in the temporal dynamics, and which is more reliable should influence the correct choice of event detection algorithm. The 1euro filter is an added complication since it introduces further unknowns we can't control for. In eye movement research, effect sizes for e.g. fixation durations as a measure of mental load are small and would require consistent error across experimental conditions. Consistent error, either spatially or temporally, can be controlled for; inconsistent error is a much bigger problem, even if the size of the error is smaller. In an experiment that wants to track fixations on objects and their relationship to hand movements, this error will come out either as an inaccurate delay between fixation and hand movement, or as appearing to look at another object than the one you're really looking at. Either is problematic and needs careful consideration to figure out which measures are valid.

Attached is the temporal precision we recorded from our tracker, both in eye camera timestamps and in scene camera timestamps. I guess the multimodal distribution is loss - but this doesn't make total sense, as the eye camera is at 200Hz and these loss distributions don't match that framerate very well. I suspect that's the 1euro filter adding inconsistent lag/temporal error? That is - if you say that only one eye camera frame is ever used to calculate gaze position. Some kind of clustering of eye data across eye camera frames would also be a potential explanation, as would processing capacity... and many other things. Clarifying is the aim of this thread 🙂

Chat image

papr 30 June, 2021, 16:50:23

I recognize the graph 🙂 Happy to see that you followed my recommendation to join the discord server. 👍

user-359878 30 June, 2021, 16:50:54

Our programmer sent this to you, I'm the senior researcher, joining in to clarify 🙂

user-359878 30 June, 2021, 16:51:44

'isi' is 'intersample interval', calculated for each timestamp simply by subtracting the previous timestamp in whatever units they are given.
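
A minimal sketch of that calculation, assuming the timestamps are available as a 1-D numpy array in seconds (the file name here is hypothetical):

```python
import numpy as np
import matplotlib.pyplot as plt

timestamps = np.load("gaze_timestamps.npy")  # hypothetical file name

isi_ms = np.diff(timestamps) * 1000.0  # intersample intervals in milliseconds
print(f"mean: {isi_ms.mean():.2f} ms, std: {isi_ms.std():.2f} ms")

plt.hist(isi_ms, bins=200)
plt.xlabel("intersample interval [ms]")
plt.ylabel("count")
plt.show()
```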

papr 30 June, 2021, 16:55:34

Yeah, that is what I have done in my graphs above, as well. I used a 20-minute long recording with natural viewing behavior (sitting in front of a desk, programming)

As discussed above, the multiple modes in the distributions come from skipped frames.

user-359878 30 June, 2021, 17:01:54

Well - not quite - if what you say above is true? Because you can clearly see that the first distribution from the gaze signal is around 4ms, the second is under 10ms, above 15ms etc. - at 66 Hz they should be around 16ms apart without loss, and more with loss, if you are using a single pair of eye camera frames with a single timestamp that isn't averaged to write gaze. In the eye camera distribution, you can see two distributions centered around 3.2ms and 8ms. If it's sampling at 200Hz they should be 5ms apart without loss, and 10ms with a lost sample etc. This is where the confusion lies - the data doesn't look consistent with a single frame being used for gaze estimation with only loss contributing isi variance, or with 66Hz or 200Hz eye data, even taking into account the imprecision/slight variance you see around these distributions. Clearly neither 66Hz nor 200Hz sampling explains these distributions of intersample temporal intervals - though perhaps if the 1euro filter is adding lag, this might explain things? Otherwise, a full buffer dumping out data inconsistently and taking timestamps after processing rather than from eye camera frames would explain it?? Or you may have a better explanation?

papr 30 June, 2021, 17:32:29

I agree that the modes being at <5 and <10ms is unexpected. I will look into that. I can tell you that the 1 eur filter does not change the timestamp and that the timestamp is recorded as soon as the frame reaches the phone.

user-359878 30 June, 2021, 17:33:10

Wait - the gaze timestamps are from the frame of the eye camera, or the time the frame reaches the phone? These are two different things! These details are the problem - not that we have a problem with there being error - but that we have a problem in not knowing whether the error is consistent or not, and measuring it.

papr 30 June, 2021, 17:34:25

The recorded timestamp is measured as soon as the frame reaches the phone. As I mentioned above:

The camera starts exposing the image at T0 and sends the image via USB to the phone. The software receives the frame data at T1. T1 is used to timestamp the frame. The T1-T0 difference is the 30ms delay.
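
If that exposure-to-reception delay is indeed roughly constant, one simple correction is to shift the recorded timestamps back by that constant. A sketch under that assumption (the 30 ms value is taken from the description above, not measured here):

```python
import numpy as np

ASSUMED_TRANSMISSION_DELAY_S = 0.030  # approximate T1 - T0, assumed constant

def approximate_exposure_times(recorded_ts_s):
    """Shift reception timestamps (T1) back to approximate exposure times (T0)."""
    return np.asarray(recorded_ts_s) - ASSUMED_TRANSMISSION_DELAY_S
```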

user-359878 30 June, 2021, 17:36:48

I get the 30ms delay (temporal inaccuracy - not a big issue since it's fairly consistent - please confirm - and can be corrected as in ioHub, too, which explicitly takes account of time recorded vs time written for this kind of problem) - but that's not what we're seeing above (inconsistent temporal precision - large - potentially a problem, especially if it is not measured/knowable, perhaps due to the 1euro filter or processing not accounted for), correct? We're seeing inconsistent sampling which does not match either the eye camera or the scene camera sample/frame rate, and which lost samples alone cannot explain. It's big error - not the small error one would usually see as a result of sampling. One distribution around the sample rate would be expected; seven distributions, none of which match the sampling rate in intersample intervals, is not. We're trying to ascertain the source of this sampling inconsistency in order to detect fixations and saccades accurately, or at least with error significantly smaller than typical eye movement effect sizes.

papr 30 June, 2021, 18:07:04

can be corrected as in ioHub, too, which takes account of time recorded vs time written I do not see the relevance for "time written". But I will have to look into what ioHub is correcting for exactly.

Generally it makes sense to see multiple modes given that frames can be skipped. It also makes sense that they are evenly spaced. The only thing I am unsure about is the location of the modes. I will look into that and come back to you in this regard. If the cameras ran at 250 Hz, the 4 ms differences would make perfect sense.

That said, this analysis is purely based on frame timestamps and does therefore not say anything about the temporal accuracy of the gaze signal. As you said, the 1eur filter could introduce an additional error.

user-359878 30 June, 2021, 18:13:54

Great - then if we understand each other I will just wait for your considered response on the timestamps and their relationship to the gaze signal, when you've had a chance to look into it.

Re this: "That said, this analysis is purely based on frame timestamps and does therefore not say anything about the temporal accuracy of the gaze signal. As you said, the 1eur filter could introduce an additional error." - but what is the temporal accuracy/precision of the gaze data if not these timestamps? Is there any other marker of the temporal progression of the gaze signal? It's not gaze signal without that temporal component!

If it is actually 250Hz, this would make sense if the eye camera timestamps were 4ms apart, but not the gaze timestamps which must be somehow fit or averaged onto the world camera frames at 66Hz(?) or around 16ms intervals. The main concern is whether the exact timestamp given is consistent with regard to the eye behaviour/gaze position data. It can be inaccurate - e.g. offset by 30ms - and that's correctable once we know. It can also be imprecise due to processing load variance while calculating and writing gaze position data/making it available to listening software; that's somewhat of a problem, but only a big one if this temporal inaccuracy is not measured and not consistent with regard to the real-time eye position data. Thanks!

papr 30 June, 2021, 18:24:58

"if the eye camera timestamps were 4ms apart, but not the gaze timestamps which must be somehow fit or averaged onto the world camera frames at 66Hz(?) or around 16ms intervals" - This is not correct. Since the gaze timestamps are inherited from the left eye camera, their differences must show the same modes.

Gaze can be estimated in scene camera coordinates without a scene camera even attached.

Our strategy for visualizing gaze in the scene video is to find the gaze sample that is closest to the scene video frame in time.
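
A sketch of that matching strategy for sorted timestamp arrays (function and variable names are made up for illustration):

```python
import numpy as np

def match_gaze_to_frames(frame_ts, gaze_ts):
    """For each scene frame timestamp, return the index of the gaze sample
    closest in time. Both arrays must be sorted and share the same clock."""
    frame_ts = np.asarray(frame_ts)
    gaze_ts = np.asarray(gaze_ts)
    idx = np.searchsorted(gaze_ts, frame_ts)
    idx = np.clip(idx, 1, len(gaze_ts) - 1)
    left, right = gaze_ts[idx - 1], gaze_ts[idx]
    closer_to_left = (frame_ts - left) < (right - frame_ts)
    return idx - closer_to_left.astype(int)

# Example: frames at ~30 Hz matched against gaze at ~66 Hz
frames = np.arange(0.0, 1.0, 1 / 30)
gaze = np.arange(0.0, 1.0, 1 / 66)
print(match_gaze_to_frames(frames, gaze)[:5])
```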

papr 30 June, 2021, 18:32:00

Generally, I feel like we are not 100% using the same terminology yet. And my apologies, it is very much possible that I am using the wrong terminology in some cases. 🙂

user-359878 30 June, 2021, 18:34:53

If you are writing timestamps from eye camera frames at a lower sample rate by skipping samples that are not the most recent - then you ARE fitting the timestamps from eye camera to the world camera. It's a simple fit, but it is a fit. If you are just saying 'closest sample', without recording/measuring whether it's the one before, two before, one after the gaze/world camera frame, then this is yet another potential source of temporal imprecision. We can produce better data from the eye movement signal if these potential sources of error are known and considered for event detection.

Gaze can be estimated without a stimulus plane of any sort - sure! But it will still have a timestamp that is based on the framerate of the eye camera, so the exact same temporal precision and accuracy questions would apply. Here, as I understand from everything you've said above, there are a number of complications to this simple scenario, including:
- The eye camera frame rate is either 200Hz or 250Hz...
- Any of a number of eye camera frames may be used to calculate gaze on the world camera frame - three or four eye camera frames, each producing different gaze coordinates, occur for every gaze/world camera frame. Loss, or something unknown, is causing inconsistent sampling to be reported/imprecise timestamps, apart from the inaccuracy of 30ms - that inaccuracy, if consistent, is not a problem.
- The use of a 1euro filter without reporting unfiltered data potentially introduces inconsistent temporal precision too, since the nature of such a filter is to trade off lag and spatial accuracy adaptively.
- Other things which seem too small to mention until the above is clarified 😉

No worries with terminology - I'm very used to working interdisciplinary and I think I understand your terms, even if I am used to using standard eye tracking ones 🙂 I may be off in the computer-science terms.

papr 30 June, 2021, 19:24:52

Well, my response got a bit longer, so I posted it here: [email removed] Feel free to respond to any of those using the labels (letter.number).

user-359878 30 June, 2021, 19:05:46

PS - this isn't to say that the 1euro filter isn't a good option - it probably is in many cases - it just has a potential trade-off with data quality which affects which measures can be validly inferred. In interaction, if there's a good time to have the worst lag, it is probably during saccades, when the movement is faster and you don't take up visual information - so if this were in use in VR, it would make sense to keep lag away from fixations and prioritise spatial accuracy during fixations. Lag should also be avoided during smooth pursuit, though - there's a potential physiologically based lower bound that could be useful there. All reasonable error can be worked around if known - we want to estimate which effects are detectable in terms of fixations, saccades etc., so we need a sense of what temporal error we're working with, and correct whatever is correctable while working around what remains unknown. When differences in e.g. fixation durations are often a matter of tens of milliseconds across experimental conditions, this uncertainty in the temporal data starts to influence which measures to take, and hence what kind of experiment or test would answer the research question. Thanks a lot!

papr 30 June, 2021, 19:28:44

I tried to separate the issues by topic as well as possible (following my mental model of the problem). Let me know if you think that problems can be further separated or actually would belong together based on your own mental model.

user-359878 30 June, 2021, 19:32:34

Great post! Thanks - and yes, this is a lot clearer now - the problem is defined, even if some mystery remains to be tested. If the tracker will record monocularly, I could record with an artificial eye and send the data, and also calculate spatial precision (system noise, if the tracker and artificial eye are perfectly stable) and temporal precision. Happy to share that data if useful. In general, I'm all in favour of smart processing to streamline the pipeline, but since data quality needs differ with the measures of interest, I'm very much in favour of making said smart tools selectable. A lot of the value of eye movement data goes beyond knowing where someone is looking - i.e. spatial accuracy - and into how the eye is moving, temporal dynamics etc., as more cognitive measures of attention and intention. I look forward to your response!

papr 30 June, 2021, 20:18:19

Follow up: I cleaned up my isi code and put it here for your reference https://nbviewer.jupyter.org/gist/papr/aa376b50909130c772b1608c51f827cd

user-359878 30 June, 2021, 20:25:02

Thanks - I think plotting sample velocity over time against those isi's will also be telling - you would see whether the 1euro filter is trading accuracy for lag during saccades or smooth pursuit. I would do that while recording some fixate-saccade and smooth pursuit data - e.g. by watching your thumb wave back and forth - anything! Watching traffic. Having different speeds of moving target will show where the threshold change kicks in, and then we can see whether it's causing temporal imprecision. For what it's worth, of course! I think we can both see the problem now - though I still don't have enough information to know how to solve it, or if it can be solved. I'll think about it and record some more data. The artificial eye will have no speed, so if the error is due to the 1euro filter then there should be no velocity adaptation and hence no increase in temporal imprecision during recording... I think...

papr 30 June, 2021, 20:31:46

I think I have a few fitting datasets and will try plotting x/y velocity against isi (will be interesting to see differences between realtime and posthoc 200hz gaze signal). Although, I am still not sure if it is meaningful. I will have to think it through tomorrow. Have a nice evening!

user-359878 30 June, 2021, 20:32:55

Yep, I will also have a think. If the gaze data is supposed to be timestamped with the eye camera frame closest to the world camera frame, then those short isi's of e.g. 4 or 8ms should not exist in the gaze data. Of the options you present in the doc, writing all gaze samples with their respective eye camera timestamps that fall within a window starting and ending halfway between valid world camera frames would be my choice (C1.3 ~ish), and those timestamps should be reported even for lost samples that don't produce a gaze coordinate. It's better to have lost samples with missing gaze data and only timestamps, and to know how many samples were lost and when, than to leave timestamp precision up to an adaptive threshold in a black box and dependent on processing lag, I think? But more data and thinking-through later - we'll run some tests of the assumptions you've described here and see what we get. Thanks and have a nice evening too!

papr 01 July, 2021, 08:29:27

C happens post-hoc / after all video frames and timestamps have been recorded already. From your message, I get the feeling that you are still thinking that gaze is somehow timestamped relative to the scene camera. To clarify, it is not.

The general philosophy is: Record everything at the maximum speed possible, timestamp as accurately as possible, and match streams later.

End of June archive