5 Best Ways to Convert Lists of Lists to NumPy Arrays in Python

💡 Problem Formulation: In Python, data often starts out as a list of lists, where each sublist represents a row or a collection of elements. The task is to convert this structure into a NumPy array for numerical work such as scientific computing. For instance, given [[1, 2], [3, 4], [5, 6]] as input, the desired output is a 2-D NumPy array of shape (3, 2) holding the same elements.

Method 1: Using np.array()

The np.array() function is the direct method to convert a list of lists into a NumPy array. By passing the list of lists to np.array(), NumPy constructs a new n-dimensional array from the data.

Here’s an example:

import numpy as np

list_of_lists = [[1, 2], [3, 4], [5, 6]]
np_array = np.array(list_of_lists)

Output:

array([[1, 2],
       [3, 4],
       [5, 6]])

This method is straightforward and is probably the first one you should try when converting a list of lists to a NumPy array. It works well with data that is already well-structured and expected to form a rectangular array.
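As a quick check on the result, the following minimal sketch inspects the shape, dimensionality, and dtype of the converted array, and shows the optional dtype argument of np.array(). The variable names are illustrative only.

```python
import numpy as np

list_of_lists = [[1, 2], [3, 4], [5, 6]]

# Default conversion: NumPy infers an integer dtype from the values
np_array = np.array(list_of_lists)
print(np_array.shape)  # (3, 2)
print(np_array.ndim)   # 2

# Request a specific dtype at construction time
float_array = np.array(list_of_lists, dtype=float)
print(float_array.dtype)  # float64
```

Checking the shape right after conversion is a cheap way to confirm that the input really was rectangular.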

Method 2: Using np.asarray()

The np.asarray() function is similar to np.array(), but it does not copy the data if the input is already a NumPy array with a compatible dtype. For a plain list of lists it still has to build a new array, so the saving only applies when the input may already be an ndarray.

Here’s an example:

import numpy as np

list_of_lists = [[7, 8], [9, 10], [11, 12]]
np_array = np.asarray(list_of_lists)

Output:

array([[ 7,  8],
       [ 9, 10],
       [11, 12]])

This function is useful in code that may receive either lists or existing arrays, especially large ones, because it avoids making a redundant copy when the input is already an array.
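The difference in copying behavior can be seen directly. The sketch below, with illustrative variable names, compares np.asarray() and np.array() on an input that is already an ndarray, and then on a plain list.

```python
import numpy as np

list_of_lists = [[7, 8], [9, 10], [11, 12]]
existing = np.array(list_of_lists)

# For an existing ndarray, asarray hands back the same object ...
same = np.asarray(existing)
print(same is existing)  # True

# ... while np.array copies by default
copied = np.array(existing)
print(copied is existing)  # False

# For a plain list, asarray still has to build a new array
from_list = np.asarray(list_of_lists)
print(np.array_equal(from_list, existing))  # True
```

Because asarray may return the original object, modifying the result in place can also modify the input array.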

Method 3: Using np.vstack()

np.vstack() stacks a sequence of rows vertically, so each sublist becomes one row of a 2-D array. It does not accept sublists of different lengths: if your rows vary in size, pad them to a common length first, as the example does with zeros.

Here’s an example:

import numpy as np

list_of_lists = [[13], [14, 15], [16, 17, 18]]
width = max(len(row) for row in list_of_lists)
# Pad every row with zeros to the longest row's length, then stack
padded = [row + [0] * (width - len(row)) for row in list_of_lists]
np_array = np.vstack(padded)

Output:

array([[13,  0,  0],
       [14, 15,  0],
       [16, 17, 18]])

This method is useful when rows arrive as separate lists or arrays. Note that np.vstack() does not fill in the “gaps” itself: passing ragged sublists directly raises a ValueError, so the padding step is required.
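np.vstack() needs rows of equal length, so ragged input has to be padded beforehand. The following sketch shows the failure on ragged input and one possible way to pad with an arbitrary fill value. The helper pad_rows and its fill_value parameter are hypothetical names introduced for illustration, not part of NumPy.

```python
import numpy as np

ragged = [[13], [14, 15], [16, 17, 18]]

# Stacking rows of different lengths fails with a ValueError
try:
    np.vstack(ragged)
    stacked_ok = True
except ValueError:
    stacked_ok = False
print(stacked_ok)  # False


def pad_rows(rows, fill_value=0):
    """Hypothetical helper: copy ragged rows into a pre-filled 2-D array."""
    width = max(len(row) for row in rows)
    out = np.full((len(rows), width), fill_value)
    for i, row in enumerate(rows):
        out[i, :len(row)] = row
    return out


padded = pad_rows(ragged, fill_value=-1)
print(padded.tolist())  # [[13, -1, -1], [14, 15, -1], [16, 17, 18]]
```

Pre-allocating with np.full() makes the fill value explicit, which helps when zero is a legitimate data value.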

Method 4: Using np.concatenate()

np.concatenate() joins a sequence of arrays along an existing axis. To work with a list of lists, you would first convert each sublist into a 1-D array; concatenating those along axis 0 produces one flat array, which then has to be reshaped into rows.

Here’s an example:

import numpy as np

list_of_lists = [[19, 20], [21, 22], [23, 24]]
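arrays = [np.array(sublist) for sublist in list_of_lists]
# Concatenating 1-D arrays gives a flat array, so reshape into one row per sublist
np_array = np.concatenate(arrays, axis=0).reshape(len(list_of_lists), -1)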

Output:

array([[19, 20],
       [21, 22],
       [23, 24]])

This method gives you more control over the conversion process, especially if you need to concatenate data along a particular axis, but it requires extra steps, like reshaping, to achieve the final desired array structure.
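The reshape can be avoided by giving each sublist an explicit row axis before concatenating. The sketch below is one possible variant under that assumption, and it also shows how the axis argument changes the result; the variable names are illustrative.

```python
import numpy as np

list_of_lists = [[19, 20], [21, 22], [23, 24]]

# Give each sublist an explicit row axis: shape (2,) becomes (1, 2)
rows = [np.array(sublist)[np.newaxis, :] for sublist in list_of_lists]

# Joining along axis 0 stacks the rows; no reshape needed
by_rows = np.concatenate(rows, axis=0)
print(by_rows.shape)  # (3, 2)

# Joining the same pieces along axis 1 places them side by side instead
side_by_side = np.concatenate(rows, axis=1)
print(side_by_side.shape)  # (1, 6)
```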

Bonus One-Liner Method 5: Using np.array() with a generator expression

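You can use a generator expression to convert each sublist in the list of lists into an array, which can be useful if your list contains complex structures or if you need to filter or transform sublists along the way. np.array() does not iterate over a generator itself (it would wrap the generator object in a 0-dimensional object array), so the generator has to be materialized with list() first.

Here’s an example:

import numpy as np

list_of_lists = [[25, 26], [27, 28], [29, 30]]
# list() consumes the generator; np.array() cannot read a generator directly
np_array = np.array(list(np.array(sublist) for sublist in list_of_lists))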

Output:

array([[25, 26],
       [27, 28],
       [29, 30]])

This method compresses the steps into a one-liner. However, the use of generator expressions may be less readable for some users and may not provide a significant advantage over the more straightforward np.array() approach.
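One caveat is worth seeing in code: np.array() treats a bare generator as a single Python object rather than iterating over it. The sketch below shows that behavior, a filtered conversion that materializes the generator with list(), and np.fromiter() as an alternative that does consume an iterator but only produces a 1-D array. The filter condition (row sum above 52) is an arbitrary assumption chosen for illustration.

```python
import numpy as np

list_of_lists = [[25, 26], [27, 28], [29, 30]]

# A bare generator is stored as a single Python object, not iterated
wrapped = np.array(row for row in list_of_lists)
print(wrapped.shape)  # ()
print(wrapped.dtype)  # object

# Materialize the generator first; the condition is illustrative only
filtered = np.array(list(row for row in list_of_lists if sum(row) > 52))
print(filtered.tolist())  # [[27, 28], [29, 30]]

# np.fromiter consumes an iterator of scalars and yields a 1-D array
flat = np.fromiter((x for row in list_of_lists for x in row), dtype=np.int64)
reshaped = flat.reshape(len(list_of_lists), -1)
print(reshaped.tolist())  # [[25, 26], [27, 28], [29, 30]]
```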

Summary/Discussion

  • Method 1: np.array(). Straightforward and easy to use. Best for well-structured lists of lists where a copy of the data is acceptable.
  • Method 2: np.asarray(). Similar to Method 1 but avoids data copying if the input is already an array. Ideal for memory efficiency.
  • Method 3: np.vstack(). Stacks equal-length rows vertically into a 2-D array. It does not pad ragged input: sublists of different lengths must be padded to a common length before stacking, otherwise a ValueError is raised.
  • Method 4: np.concatenate(). Offers precise control over axis of concatenation but requires additional steps like reshaping the array.
  • Bonus One-Liner Method 5. A compact approach using a generator expression, which may suit certain scenarios but generally offers no significant benefit over Method 1.