Efficient ways to read table data from a large feature layer?

I have a feature layer with about 460,000 records, and reading its table with arcpy.da.TableToNumPyArray() currently takes about 20 minutes. Is there a more efficient way to read the rows so that I can then manipulate the data? It seems like there should be a more “C-like” way to access the rows. Here is the function in its entirety, though I’m focused on the line near the bottom where I call arcpy.da.TableToNumPyArray() to read out the data:

import arcpy
import numpy as np

def import_surface(surface, grid, nrow, ncol, row_col_fields, field_to_convert):
    """
    References raster surface to model grid.
    Returns an array of size nrow, ncol.
    """
    out_table = r'in_memory\{}'.format(surface)
    grid_oid_fieldname = arcpy.Describe(grid).OIDFieldName

    # Compute the mean surface value for each model cell and output a table (~20 seconds)
    arcpy.sa.ZonalStatisticsAsTable(grid, grid_oid_fieldname, surface, out_table, 'DATA', 'MEAN')

    # Build some layers and views
    grid_lyr = r'in_memory\grid_lyr'
    table_vwe = r'in_memory\table_vwe'
    arcpy.MakeFeatureLayer_management(grid, grid_lyr)
    arcpy.MakeTableView_management(out_table, table_vwe)

    grid_lyr_oid_fieldname = arcpy.Describe(grid_lyr).OIDFieldName
    # table_vwe_oid_fieldname = arcpy.Describe(table_vwe).OIDFieldName

    # Join the output zonal stats table with the grid to assign row/col to each value.
    arcpy.AddJoin_management(grid_lyr, grid_lyr_oid_fieldname, table_vwe, 'OID_', 'KEEP_ALL')

    # Take the newly joined grid/zonal stats and read out tuples of (row, col, val) (takes ~20 minutes)
    a = arcpy.da.TableToNumPyArray(grid_lyr, row_col_fields + [field_to_convert], skip_nulls=False)

    # Rename the fields of the 1D structured array from TableToNumPyArray, sort by row/col, and reshape the values to 2D (~0.1 seconds)
    a = np.rec.fromrecords(a.tolist(), names=['row', 'col', 'val'])
    a.sort(order=['row', 'col'])
    b = np.reshape(a.val, (nrow, ncol))

    return b
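
For reference, the final sort-and-reshape step can be tried on its own without arcpy. The snippet below is only a minimal sketch: the `records` array is a made-up stand-in for what TableToNumPyArray() returns (a 1D structured array with one element per grid cell), and the field names and the 2 x 3 grid size are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical stand-in for the TableToNumPyArray() output:
# one (row, col, val) record per grid cell, in arbitrary order.
records = np.array(
    [(2, 1, 30.0), (1, 2, 20.0), (1, 1, 10.0),
     (2, 2, 40.0), (1, 3, 25.0), (2, 3, 45.0)],
    dtype=[('row', 'i4'), ('col', 'i4'), ('val', 'f8')],
)
nrow, ncol = 2, 3  # assumed grid dimensions for this toy example

# Sort in place by row, then by col, so the values come out in row-major order.
records.sort(order=['row', 'col'])

# Pull out the value field and fold it into an (nrow, ncol) array.
grid_values = records['val'].reshape(nrow, ncol)
print(grid_values)
```

This only works when every (row, col) pair appears exactly once, since the reshape relies on there being exactly nrow * ncol sorted records.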
