Working around 32-bit astype(int) in numpy on Windows with pandas

  cross-platform, numpy, pandas, python, windows

I have a codebase I’ve been developing on a Mac (and running on Linux machines) based largely on pandas (and therefore numpy). Very commonly I type-cast with astype(int).

Recently a Windows-based developer joined our team. In an effort to make the codebase more platform-independent, we’re trying to gracefully tackle a tricky issue: on Windows, numpy maps int to a 32-bit type instead of a 64-bit one, so any integer above 2**31 - 1 (2147483647) silently wraps around.

On a Mac, we see:

ipdb> ids.astype(int)
id
1818726176      1818726176  
1881879486      1881879486  
2590366906      2590366906  
284399109       284399109   
299981685       299981685   
370708200       370708200   
387277023371    387277023371
387343898032    387343898032
406885699892    406885699892
5262665206      5262665206  
544687374       544687374   
6978317806      6978317806  

Whereas on a Windows machine (in PowerShell), we see:

ipdb> ids.astype(int)
id
1818726176      1818726176
1881879486      1881879486
2590366906     -1704600390
284399109       284399109 
299981685       299981685 
370708200       370708200 
387277023371    729966731 
387343898032    796841392 
406885699892   -1136193228
5262665206      967697910 
544687374       544687374 
6978317806     -1611616786
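
The wrapped values are exactly what a cast to a 32-bit integer produces, so the Windows behaviour can be reproduced on any platform by requesting int32 explicitly. A minimal sketch, using a few ids from the output above (the variable names are only for illustration):

```python
import numpy as np
import pandas as pd

# A few ids from the output above, held as 64-bit integers.
ids = pd.Series([1818726176, 2590366906, 387277023371, 5262665206], dtype="int64")

# Explicit 32-bit cast on the underlying array: values above 2**31 - 1
# wrap around modulo 2**32, which is the pattern seen on Windows.
wrapped = ids.to_numpy().astype(np.int32)

# Explicit 64-bit cast: the values survive unchanged.
kept = ids.astype(np.int64)

print(wrapped.tolist())  # [1818726176, -1704600390, 729966731, 967697910]
print(kept.tolist())     # [1818726176, 2590366906, 387277023371, 5262665206]
```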

Other than using a sed call to change every astype(int) to astype(np.int64) (which would also require adding import numpy as np at the top of every module that doesn’t already have it), is there a way to make astype(int) produce 64-bit integers on every platform?
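
One variation on the sed approach that would at least avoid the extra import: astype also accepts a dtype name as a string, so astype("int64") asks for 64 bits explicitly. A minimal sketch, assuming the ids are stored as strings (the sample values are taken from the output above):

```python
import pandas as pd

# Assumption for illustration: ids arrive as strings in an object column.
ids = pd.Series(["2590366906", "387277023371", "544687374"], dtype=object)

# The string alias "int64" names the 64-bit type directly,
# so no numpy import is needed in the calling module.
ids64 = ids.astype("int64")

print(ids64.dtype)     # int64
print(ids64.tolist())  # [2590366906, 387277023371, 544687374]
```

This still means touching every call site, so it does not replace a global setting, if one exists.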

In particular, I was hoping to map int to numpy.int64 somehow in a pandas option or something.

Thank you!

Source: Windows Questions
