
keras.utils.PyDataset / tf.keras.utils.Sequence ignoring __len__ different behavior Keras2/Keras3 (tensorflow 2.16) #19994


Closed
Pagey opened this issue Jul 15, 2024 · 7 comments
Labels
stale · stat:awaiting response from contributor · type:support (User is asking for help / asking an implementation question. Stackoverflow would be better suited.)

Comments


Pagey commented Jul 15, 2024

Hi there - paraphrasing an issue from 2018:

The change is that __getitem__ returns idx instead of self.alist[idx]. This matters for generated datasets: it looks as though the __len__ value is ignored, and it used not to be?

import tensorflow as tf


class InfiniteGenerator(object):
    def __init__(self, alist):
        self.alist = alist

    def __getitem__(self, idx):
        return idx  # self.alist[idx]

    def __len__(self):
        return len(self.alist)

    def __iter__(self):
        # Explicit __iter__ honors __len__, so iteration stops after len(alist) items.
        for item in (self[i] for i in range(len(self))):
            yield item


class KGen(tf.keras.utils.Sequence):
    # Same __getitem__/__len__, but no explicit __iter__: iteration behavior
    # is left to the base class / Python's protocols.
    def __init__(self, alist):
        self.alist = alist

    def __getitem__(self, idx):
        return idx  # self.alist[idx]

    def __len__(self):
        return len(self.alist)


if __name__ == '__main__':
    ig = InfiniteGenerator(list(range(4)))
    for item in ig:
        print(item)


    print('now trying second iterator')

    import time
    time.sleep(1)

    kg = KGen(list(range(4)))
    for item in kg:
        print(item)

The above code on TensorFlow 2.15 (Python 3.10.13, Ubuntu 20.04) produces this output:

0
1
2
3
now trying second iterator
0
1
2
3

And on TensorFlow 2.16 (Python 3.10.13, Ubuntu 20.04) it produces this output:

0
1
2
3
now trying second iterator
0
1
2
3
4
5
6
7
.......
(output continues indefinitely)
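A plain-Python sketch of what may be happening here: when a class defines __getitem__ but no __iter__, direct iteration falls back to Python's legacy sequence protocol, which calls __getitem__(0), __getitem__(1), ... and stops only on IndexError; __len__ is never consulted. Assuming Keras 3's PyDataset no longer defines __iter__ the way the Keras 2 Sequence did (an assumption, not verified against the source), iterating KGen directly would behave exactly like this:

class LegacyProtocolDemo:
    """Defines __getitem__ and __len__ but no __iter__, like KGen."""

    def __init__(self, n):
        self.n = n

    def __getitem__(self, idx):
        return idx  # never raises IndexError, so iteration never stops

    def __len__(self):
        return self.n  # ignored by the legacy sequence protocol


for i in LegacyProtocolDemo(4):
    print(i)  # prints 0, 1, 2, ... forever unless interrupted
    if i > 10:
        break  # guard so the demo terminates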
mehtamansi29 (Collaborator) commented:

Hi @Pagey -

The KGen class inherits from tf.keras.utils.Sequence. In tf.keras.utils.Sequence (the PyDataset class), the __getitem__() method should return a complete batch, and the __len__() method should return the number of batches in the dataset. More details can be found here.

So in the KGen class, __getitem__() should return elements from the underlying data: self.alist[idx] returns the element of self.alist at that index, whereas returning idx returns only the index itself. Attached gist for reference.
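For illustration, a minimal sketch of that contract (the class name, shapes, and data below are made up for the example, not taken from the gist): __getitem__ returns one complete (x, y) batch per index, and __len__ returns the batch count.

import math

import tensorflow as tf


class BatchedData(tf.keras.utils.Sequence):
    def __init__(self, x, y, batch_size):
        self.x, self.y = x, y
        self.batch_size = batch_size

    def __getitem__(self, idx):
        # Return one complete batch per index.
        lo = idx * self.batch_size
        return self.x[lo:lo + self.batch_size], self.y[lo:lo + self.batch_size]

    def __len__(self):
        # Number of batches in the dataset, not number of samples.
        return math.ceil(len(self.x) / self.batch_size)

# e.g. with 100 samples and batch_size=32, len(dataset) == 4 and idx runs 0..3.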

mehtamansi29 added the type:support (User is asking for help / asking an implementation question. Stackoverflow would be better suited.) and stat:awaiting response from contributor labels Jul 16, 2024
Pagey (Author) commented Jul 16, 2024

Thanks @mehtamansi29 - it looks like you changed the code in the gist between the TensorFlow 2.15 and 2.16 versions? The

def __getitem__(self, idx):
    return idx  # self.alist[idx]

is supposed to represent an infinite data generator and is thus not limited to the length of self.alist. It could just as well have been written as: return np.random.random()

In any case, this represents a difference in behavior between the two versions, i.e. one that terminates after len()/__len__ batches (TensorFlow 2.15) and one that does not (TensorFlow 2.16).

I saw that in the new version the __len__ method is replaced by num_batches, but it doesn't seem to have the same effect as __len__ did in 2.15 either.
How should one terminate after __len__/num_batches batches in TensorFlow 2.16 in the case of an infinitely generated set?

thirstythurston commented:

Hello,

I am seeing the same problem. I was looking at the source for the PyDataset class and I think I may have found an issue. I first noticed that more calls are made to __getitem__ than __len__ returns. I tried setting up a num_batches property, since it is part of the PyDataset class (lines 157 and 158 of the source code for py_dataset_adapter.py). I might be wrong, but when that property is called and I supply a calculation for the number of batches, I get an error:

File "/home/user/miniconda3/envs//lib/python3.12/site-packages/keras/src/trainers/data_adapters/py_dataset_adapter.py", line 288, in get_tf_dataset
num_samples = min(num_samples, num_batches)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: '<' not supported between instances of 'method' and 'int'

The code for that section is calling

num_batches = self.py_dataset.num_batches
if self._output_signature is None:
    num_samples = data_adapter_utils.NUM_BATCHES_FOR_TENSOR_SPEC
    if num_batches is not None:
        num_samples = min(num_samples, num_batches)

This seems off to me, but I may be wrong: the line

num_batches = self.py_dataset.num_batches

is returning the num_batches method itself, which cannot be evaluated by the min statement, and should be

num_batches = self.py_dataset.num_batches()

and above, at line 165, the def num_batches should return

if hasattr(self, "__len__"):
    return self.__len__()

where it currently has return len(self), which should just give an infinite recursion error.

The combination of these bugs might be leading to len being None and causing the generator to be called an infinite number of times.
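A standalone sketch of the failure mode described above (illustrative Python, not the actual Keras source): if num_batches were a plain method rather than a property, attribute access would yield a bound method, and min() would raise exactly the TypeError from the traceback:

class FakeDataset:
    def num_batches(self):  # plain method, deliberately not a @property
        return 8


ds = FakeDataset()
num_batches = ds.num_batches  # a bound method, not an int
try:
    min(100, num_batches)
except TypeError as err:
    print(err)  # '<' not supported between instances of 'method' and 'int'

# Declared as a @property instead, ds.num_batches would evaluate to 8
# and min(100, ds.num_batches) would work.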

I've also noticed, from the calls to __getitem__ of the PyDataset class, that some indexes are requested more than once, as if a call is made, something fails, and then __getitem__ is called again. When I print the number of times __getitem__ has been called, the number of batches delivered to training is always larger than what the progress bar reports when training a model in TensorFlow.

I generate my data on the fly, so perhaps I need to design my dataset better. My particular issue is that I process the data coming off the disk before sending it to training, to save system memory. Doing this with multiple workers on the I/O side causes issues where the data is not necessarily associated with a particular index. I get that this is an issue on my part, but it would be nice if the description of PyDataset made it clear that not all batches of data delivered are used in training.

dhantule (Contributor) commented:

how should one terminate after __len__/num_batches batches in tensorflow 2.16 in case of an infinitely generated set?

Hi @thirstythurston, thanks for reporting this. You can try to terminate it using

if idx >= self.__len__():
    raise StopIteration

Attaching gist for reference. Thanks!
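For the direct-iteration reproduction from the original report, the guard would slot into the KGen class roughly like this (a sketch, not the gist itself; raising IndexError, the legacy sequence protocol's canonical stop signal, would also terminate the loop):

import tensorflow as tf


class KGen(tf.keras.utils.Sequence):
    def __init__(self, alist):
        self.alist = alist

    def __getitem__(self, idx):
        if idx >= len(self):
            raise StopIteration  # stop after __len__ batches
        return idx

    def __len__(self):
        return len(self.alist)


for item in KGen(list(range(4))):
    print(item)  # 0 1 2 3, then the loop terminates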

github-actions bot commented:

This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions bot added the stale label Jan 25, 2025

github-actions bot commented Feb 8, 2025

This issue was closed because it has been inactive for 28 days. Please reopen if you'd like to work on this further.

github-actions bot closed this as completed Feb 8, 2025