Raju Nekadi

A few new things in Python that I learned last week.

Chunking a huge list into N smaller lists of equal size

In order to backfill data for one of our machine learning pipelines, I had to divide a list of dates into n smaller lists of roughly equal length and distribute them across n GPU clusters.

from datetime import timedelta, date

# Build a list of dates covering 2023, one date every 3 days
start_dt = date(2023, 1, 1)
end_dt = date(2023, 12, 31)
cdays = []

while start_dt < end_dt:
    cdays.append(start_dt)
    start_dt += timedelta(days=3)

#print(cdays)

def split(a, n):
    # divmod gives the base chunk size k and the remainder m;
    # the first m chunks each get one extra element
    k, m = divmod(len(a), n)
    return (a[i*k + min(i, m):(i+1)*k + min(i+1, m)] for i in range(n))

split_list = list(split(cdays, 15))
print(split_list[0])

#Output
[datetime.date(2023, 1, 1), datetime.date(2023, 1, 4), datetime.date(2023, 1, 7), datetime.date(2023, 1, 10), datetime.date(2023, 1, 13), datetime.date(2023, 1, 16), datetime.date(2023, 1, 19), datetime.date(2023, 1, 22), datetime.date(2023, 1, 25)]

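Each of the 15 chunks differs in length by at most one element: divmod computes the base chunk size k and the remainder m, and the first m chunks each absorb one extra date. To actually fan the chunks out, you could pair each chunk with a cluster index. Here is a minimal sketch; submit_backfill is a hypothetical placeholder for whatever actually schedules the backfill job on a GPU cluster, it is not part of the original pipeline.

# Hypothetical sketch: pair each date chunk with a cluster id
def submit_backfill(cluster_id, dates):
    # Placeholder for the real job-submission call
    print(f"cluster-{cluster_id}: {len(dates)} dates, starting {dates[0]}")

for cluster_id, chunk in enumerate(split(cdays, 15)):
    submit_backfill(cluster_id, chunk)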

__all__ in Python

__all__ holds the names of the variables, functions, or classes that you want to expose to wildcard imports (from module import *). It really comes in handy when your base module contains a large number of functions and you only want to export a few of them. For example, let's say you have:


#foo.py

waz = 3
bar = 9

def baz():
    return 'baz'

# Only bar and baz are exported on a wildcard import; waz is not
__all__ = ['bar', 'baz']


and now bar.py imports from foo.py:

#bar.py

from foo import *

print(bar)    # 9
print(baz())  # 'baz'
print(waz)    # NameError: waz is not exported by foo's __all__

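One thing worth noting: __all__ only restricts wildcard imports. Explicit imports and attribute access still reach every top-level name, as a small check would show (check.py is a hypothetical companion script sitting next to foo.py).

#check.py

import foo
from foo import waz

print(foo.waz)  # 3 - attribute access ignores __all__
print(waz)      # 3 - an explicit from-import also ignores __all__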

Conclusion

I hope this was easy and clear to understand.

Some Final Words
If this blog was helpful and you wish to show a little support, you could:

  1. 👍 300 times for this story
  2. Follow me on LinkedIn: https://www.linkedin.com/in/raju-n-203b2115/

These actions really really really help me out, and are much appreciated!

Top comments (1)

Joseph

Excellent breakdown @rajun!

Using Python to chunk large datasets for GPU processing optimizes machine learning pipelines, and leveraging __all__ keeps your codebase clean and manageable by controlling module exports.