5 Best Ways to Count Unique Values Per Group in Python Pandas

💡 Problem Formulation: When working with tabular data in Python using the Pandas library, analysts often face the task of counting unique values across different groups. For example, given a DataFrame containing sales data, one might want to find out the number of unique products sold in each region. The input is a DataFrame with at least two columns (grouping column and values column), and the desired output is a Series or DataFrame showing each group with the corresponding count of unique values.
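For instance, with the sample data used throughout this article, the goal is to reduce one row per sale down to one count per region:

import pandas as pd

# Input: one row per sale
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Desired output: the number of distinct products per region, e.g.
# East     2
# North    1
# West     1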

Method 1: Using groupby() and nunique()

This method uses Pandas’ groupby() to group the data by a chosen column, followed by nunique() to count the unique values within each group. It is a straightforward approach, commonly used for its simplicity and readability.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products per region
unique_counts = df.groupby('Region')['Product'].nunique()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.
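If a DataFrame is preferred over a Series, the result converts easily with reset_index(). A minimal sketch, reusing unique_counts from the snippet above; the column name ‘UniqueProducts’ is an arbitrary choice for illustration:

# Convert the Series into a two-column DataFrame
unique_counts_df = unique_counts.reset_index(name='UniqueProducts')
print(unique_counts_df)
#   Region  UniqueProducts
# 0   East               2
# 1  North               1
# 2   West               1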

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count the unique ‘Product’ values in each ‘Region’. Note that this sample data swaps one ‘Durian’ for ‘Cherry’, which is why East shows 3 unique products here, unlike the other examples. Because the lambda receives each group as a Series, you can also filter or chain methods before counting, which gives more flexibility than calling nunique() directly; see the sketch below.
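In this hedged sketch, the lambda filters out ‘Apple’ (a purely illustrative choice, not part of the original example) before counting, reusing the df defined above:

# Count unique products per region, ignoring 'Apple' (illustrative filter)
filtered_counts = df.groupby('Region')['Product'].agg(
    lambda x: x[x != 'Apple'].nunique()
)
print(filtered_counts)
# East     2  (Durian, Cherry)
# North    0
# West     1  (Banana)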

Method 3: Using groupby() with value_counts()

This method uses value_counts() to build a frequency distribution of the values within each group. Although it directly returns the count of each occurrence rather than the number of unique values, that result can be used to derive the unique count indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and applies value_counts() to get the frequency distribution of ‘Product’ per region. Since each distinct (Region, Product) pair occupies exactly one row of that result, grouping it again by the ‘Region’ index level and counting rows yields the number of unique products per region. This method is more convoluted than nunique() and is usually only worth it if you also need the frequencies themselves; the chained variant below avoids the intermediate variable.
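A sketch of that chained variant, assuming the same df as above:

# Each distinct (Region, Product) pair is one row of the value_counts() result,
# so the number of rows per region equals the unique-product count
unique_counts = (
    df.groupby('Region')['Product']
      .value_counts()
      .groupby(level=0)  # level 0 of the MultiIndex is 'Region'
      .size()
)
print(unique_counts)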

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method lets you run custom operations on each group. Here, a set is used to identify unique values, since sets by definition hold only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’ and, for each group, converts the ‘Product’ values into a set, keeping only the unique ones; the length of that set is the unique count for the group. It is similar in spirit to Method 2 but relies on the uniqueness property of Python sets, at the cost of leaving pandas’ optimized code paths, which is why it may be slower; a rough timing template follows.
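Here is that template, using the standard timeit module and the df from the snippet above; absolute numbers depend on your data size and pandas version, so treat it as a sketch rather than a benchmark result:

import timeit

# Method 1 (nunique) vs. Method 4 (apply with a set)
t_nunique = timeit.timeit(
    lambda: df.groupby('Region')['Product'].nunique(), number=1000)
t_set = timeit.timeit(
    lambda: df.groupby('Region')['Product'].apply(lambda x: len(set(x))),
    number=1000)
print(f'nunique: {t_nunique:.3f}s  apply+set: {t_set:.3f}s')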

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be called inside an agg() lambda, all in a single expression. This is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and the size of the resulting array of unique values is the count. This one-liner is handy when you need to count unique values quickly and elegantly; one caveat about missing values is sketched below.
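Specifically, Series.unique() keeps NaN as a value, while nunique() drops missing values by default, so Methods 1 and 5 can disagree on data containing NaN. A small self-contained sketch:

import numpy as np
import pandas as pd

s = pd.Series(['Apple', np.nan, 'Apple'])
print(s.nunique())              # 1 -- NaN excluded by default
print(s.unique().size)          # 2 -- NaN counted as a value
print(s.nunique(dropna=False))  # 2 -- matches unique().size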

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods (see the timing sketch after this list).
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
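
For readers who want to check the efficiency remarks above on their own data, here is a rough timing sketch using the standard timeit module. The synthetic DataFrame, the run count, and the relative ordering of the methods are all assumptions that will vary with data size and pandas version:

import timeit

import numpy as np
import pandas as pd

# A larger synthetic frame so the differences are measurable
rng = np.random.default_rng(0)
big = pd.DataFrame({
    'Region': rng.choice(['North', 'South', 'East', 'West'], size=100_000),
    'Product': rng.choice([f'P{i}' for i in range(50)], size=100_000),
})

candidates = {
    'nunique': "big.groupby('Region')['Product'].nunique()",
    'agg + lambda': "big.groupby('Region')['Product'].agg(lambda x: x.unique().size)",
    'apply + set': "big.groupby('Region')['Product'].apply(lambda x: len(set(x)))",
}

for label, stmt in candidates.items():
    seconds = timeit.timeit(stmt, globals={'big': big}, number=20)
    print(f'{label:>12}: {seconds:.3f}s for 20 runs')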
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products per region
unique_counts = df.groupby('Region')['Product'].nunique()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products per region
unique_counts = df.groupby('Region')['Product'].nunique()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products per region
unique_counts = df.groupby('Region')['Product'].nunique()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products per region
unique_counts = df.groupby('Region')['Product'].nunique()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products per region
unique_counts = df.groupby('Region')['Product'].nunique()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products per region
unique_counts = df.groupby('Region')['Product'].nunique()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products per region
unique_counts = df.groupby('Region')['Product'].nunique()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products per region
unique_counts = df.groupby('Region')['Product'].nunique()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.
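
Two details of nunique() are worth knowing: it skips NaN by default (pass dropna=False to count missing values as their own category), and it can be called on the grouped DataFrame to cover several columns at once. Here is a minimal sketch; the extra Salesperson column and the None entry are hypothetical additions for illustration:

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', None],
    'Salesperson': ['Ann', 'Bob', 'Cal', 'Ann']  # hypothetical extra column
})

# Called on the grouped DataFrame, nunique() covers every remaining column
print(df.groupby('Region').nunique())

# dropna=False counts None/NaN as one extra unique value per group
print(df.groupby('Region')['Product'].nunique(dropna=False))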

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.
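
For instance, agg() also accepts several named aggregations in one pass, so the unique count can be computed alongside other statistics. A small sketch follows; the output labels n_unique and n_sales are arbitrary names chosen for this example:

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Named aggregations: unique-product count alongside the row count per region
summary = df.groupby('Region')['Product'].agg(
    n_unique='nunique',  # same result as Method 1
    n_sales='count'      # number of (non-null) sales rows per group
)

print(summary)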

Method 3: Using groupby() with value_counts()

This method uses the value_counts() function to build a frequency distribution of values within each group. Although it directly returns the count of each individual value rather than the number of distinct ones, that result can be aggregated a second time to derive the unique-value count indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This detour is more convoluted than calling nunique() directly.
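
Its compensation is that the intermediate frequency distribution is often useful in its own right. A brief sketch, reusing the same sample data, that answers a related question from the same object:

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

product_distribution = df.groupby('Region')['Product'].value_counts()

# value_counts() sorts descending within each group, so the first row
# per region is its most frequently sold product
print(product_distribution.groupby(level='Region').head(1))

# Sizing each index level reproduces the unique-product counts
print(product_distribution.groupby(level='Region').size())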

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method lets you run custom operations on each group. Here, a set is used to identify unique values, since sets by definition hold only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’ and, for each group, converts the ‘Product’ column into a set, keeping only the unique values. Taking the length of that set then gives the count of unique values per group. It is similar in spirit to Method 2 but leans on sets for the uniqueness property.
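
Because apply() hands each group over as an ordinary Series, arbitrary Python logic fits inside the function. As a sketch of something nunique() cannot express directly, the following counts unique products case-insensitively; the mixed-case data is a hypothetical stand-in for messy input:

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East'],
    'Product': ['Apple', 'banana', 'APPLE', 'Durian']
})

# Normalize case before deduplicating, then count the distinct values
unique_ci = df.groupby('Region')['Product'].apply(
    lambda x: len({p.lower() for p in x})
)

print(unique_ci)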

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation, keeping the whole computation on one line while still performing well.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and the size of the resulting array of unique values is taken. This one-liner is handy when you need a quick, elegant unique-value count.
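
Whichever of the five methods produces it, the result is a Series indexed by ‘Region’; for reporting it can be flattened back into a two-column DataFrame with reset_index(). A small sketch, where the column label unique_products is an arbitrary choice for this example:

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

unique_counts = df.groupby('Region')['Product'].nunique()

# reset_index() turns the group labels back into a regular column
print(unique_counts.reset_index(name='unique_products'))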

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products per region
unique_counts = df.groupby('Region')['Product'].nunique()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.
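Note that nunique() is not limited to a single column. Called on the grouped DataFrame itself, it counts the unique values of every non-grouping column in one pass. A minimal sketch, assuming a hypothetical extra ‘Salesperson’ column for illustration:

import pandas as pd

# Sample DataFrame with an extra (illustrative) column
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian'],
    'Salesperson': ['Ann', 'Bob', 'Ann', 'Cara', 'Bob', 'Dave', 'Cara']
})

# nunique() on the grouped frame returns one unique count per column
print(df.groupby('Region').nunique())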

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count the unique ‘Product’ values in each ‘Region’. Because the aggregation is an ordinary Python function, it can be extended with extra steps, such as filtering the values or chaining further transformations, which gives it more flexibility than calling nunique() directly.
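Because agg() accepts several aggregations at once, the unique count can also be computed alongside other statistics in a single call. Here is a small sketch using named aggregation; the result column names unique_products and total_rows are arbitrary choices:

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Named aggregation: each keyword becomes a column in the result
summary = df.groupby('Region')['Product'].agg(
    unique_products='nunique',  # distinct products per region
    total_rows='count'          # total sales rows per region
)

print(summary)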

Method 3: Using groupby() with value_counts()

This method uses the value_counts() function to build a frequency distribution of values within each group. Although it directly returns the count of each individual value rather than the number of distinct ones, the unique count can be derived from it in a second step.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and applies value_counts() to get the frequency distribution of ‘Product’ values. The intermediate result is indexed by both region and product, so grouping it again by ‘Region’ and counting the rows yields the number of unique products per region. This route works, but it is more roundabout than calling nunique() directly.
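The second grouping step can also be phrased in terms of the intermediate index: value_counts() returns a Series indexed by (Region, Product) pairs, so each region has exactly one row per distinct product, and grouping by the ‘Region’ index level and taking size() gives the same unique counts. A sketch of that variant:

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

counts = df.groupby('Region')['Product'].value_counts()

# One row per (Region, Product) pair, so rows per region = unique products
unique_counts = counts.groupby(level='Region').size()

print(unique_counts)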

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’ and, for each group, converts the ‘Product’ values into a set, which keeps only the unique items. The length of each set then gives the count of unique values per group. It is similar in spirit to Method 2 but relies on the uniqueness property of Python sets.
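One practical difference to keep in mind: a plain Python set keeps the missing-value marker as an element, while nunique() drops missing values by default. A minimal sketch with an assumed NaN in the data:

import numpy as np
import pandas as pd

s = pd.Series(['Apple', np.nan, 'Banana', 'Apple'])

print(len(set(s)))              # 3 -- NaN is kept as a set element
print(s.nunique())              # 2 -- missing values are dropped by default
print(s.nunique(dropna=False))  # 3 -- matches the set-based count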

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and the size of the resulting array of unique values is taken. This one-liner is handy when you want a quick, readable unique count.
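As a sanity check, all five approaches can be run side by side on the same sample data; they produce identical counts. A sketch:

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

m1 = df.groupby('Region')['Product'].nunique()
m2 = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))
m3 = df.groupby('Region')['Product'].value_counts().groupby(level='Region').count()
m4 = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))
m5 = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

# All five methods agree on this data
assert m1.to_dict() == m2.to_dict() == m3.to_dict() == m4.to_dict() == m5.to_dict()
print(m1.to_dict())  # {'East': 2, 'North': 1, 'West': 1}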

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods; a quick timing sketch follows this list.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
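To put the efficiency caveat for Method 4 on a rough empirical footing, the snippet below times the built-in nunique() against the set-based apply() on a larger, randomly generated frame. The data sizes and seed are arbitrary and exact numbers will vary by machine, but the vectorized nunique() is typically the faster of the two:

import timeit

import numpy as np
import pandas as pd

# Arbitrary synthetic data: 100,000 rows, 10 regions, 1,000 products
rng = np.random.default_rng(0)
big = pd.DataFrame({
    'Region': rng.integers(0, 10, 100_000),
    'Product': rng.integers(0, 1_000, 100_000),
})

for label, fn in [
    ('nunique()  ', lambda: big.groupby('Region')['Product'].nunique()),
    ('apply + set', lambda: big.groupby('Region')['Product'].apply(lambda x: len(set(x)))),
]:
    print(label, timeit.timeit(fn, number=10))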
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products per region
unique_counts = df.groupby('Region')['Product'].nunique()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products per region
unique_counts = df.groupby('Region')['Product'].nunique()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method to the ‘Product’ column, giving the number of unique products sold per region. The result is a Series indexed by the unique regions, with the counts as values.
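If you need the result as a DataFrame rather than a Series, the count can be reshaped with standard pandas tools. Here’s a minimal sketch (the variable name unique_counts_df is illustrative, not part of the original example):

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# reset_index() turns the indexed Series into a two-column DataFrame
unique_counts_df = df.groupby('Region')['Product'].nunique().reset_index()

# Alternatively, as_index=False keeps 'Region' as a regular column up front
# (behavior in recent pandas versions)
unique_counts_df = df.groupby('Region', as_index=False)['Product'].nunique()

print(unique_counts_df)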

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame (note: 'Cherry' replaces the second 'Durian'
# here, so the East region has three distinct products)
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows additional manipulations, such as filtering or chaining methods, providing more flexibility than calling nunique() directly.
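As a hedged sketch of that flexibility, agg() also supports named aggregation, so the unique count can be computed alongside other statistics in one pass. The output column names (unique_products, total_sales, unique_non_apple) are illustrative choices, not part of the original example:

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Combine the unique count with a row count and a filtered unique count
summary = df.groupby('Region')['Product'].agg(
    unique_products='nunique',                            # distinct products
    total_sales='count',                                  # rows per region
    unique_non_apple=lambda s: s[s != 'Apple'].nunique()  # distinct, ignoring 'Apple'
)

print(summary)

Named aggregation of this form requires pandas 0.25 or newer.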

Method 3: Using groupby() with value_counts()

This method uses the value_counts() function to create a frequency distribution of values grouped by a specified column. Although it directly returns the frequency of each unique value, we can use it to derive the number of unique values per group indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and applies value_counts() to get the frequency distribution of ‘Product’ values. The result is then grouped again and counted to obtain the number of unique products per region. This is more convoluted than calling nunique() directly, though the intermediate frequency distribution can be useful in its own right.
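To see why the second groupby works, it helps to inspect the intermediate result, which is a Series indexed by a (Region, Product) MultiIndex. A short sketch with the same sample data:

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Each entry is the frequency of one product within one region
product_distribution = df.groupby('Region')['Product'].value_counts()
print(product_distribution)

# Counting the entries under each 'Region' index level is equivalent
# to the second groupby().count() step above
print(product_distribution.groupby(level='Region').size())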

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be used to run custom operations on each group. Here, a set identifies the unique values, since sets by definition contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’ and, for each group, converts the ‘Product’ column into a set, keeping only the unique values. The length of each set then gives the count of unique values per group. It’s similar in spirit to Method 2 but relies on the uniqueness property of sets.
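Because apply() accepts arbitrary callables, the same pattern can return the unique values themselves instead of just their count. A minimal sketch:

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Return each region's unique products as a sorted list
unique_products = df.groupby('Region')['Product'].apply(lambda x: sorted(set(x)))

print(unique_products)

This prints a Series mapping each region to a list such as ['Apple', 'Durian'], which can be handier than a bare count when you also need to know which values occurred.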

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation; the size of the returned array is the unique count.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and the size of the resulting array of unique values is taken. This one-liner is handy when you need a quick, elegant unique count.
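One caveat worth knowing: x.unique().size and nunique() can disagree when missing values are present, because nunique() drops NaN by default while unique() keeps it in the returned array. A quick illustration:

import numpy as np
import pandas as pd

s = pd.Series(['Apple', np.nan, 'Apple'])

print(s.nunique())              # 1 -- NaN is excluded by default
print(s.unique().size)          # 2 -- NaN is kept in the array
print(s.nunique(dropna=False))  # 2 -- opt in to counting NaN

The same applies to Method 4, since len(set(x)) also counts NaN as a value.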

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods (see the benchmark sketch after this list).
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
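
For a rough sense of the efficiency differences mentioned above, here is a hedged micro-benchmark sketch. The dataset size and repetition count are arbitrary choices, and the relative timings will vary by machine and pandas version:

import timeit

import numpy as np
import pandas as pd

# A larger random dataset so the timing differences become visible
rng = np.random.default_rng(0)
big = pd.DataFrame({
    'Region': rng.choice(['North', 'South', 'East', 'West'], size=100_000),
    'Product': rng.choice([f'P{i}' for i in range(50)], size=100_000),
})

grouped = big.groupby('Region')['Product']

for label, fn in [
    ('Method 1: nunique()        ', lambda: grouped.nunique()),
    ('Method 2: agg(len(unique)) ', lambda: grouped.agg(lambda x: len(x.unique()))),
    ('Method 4: apply(len(set))  ', lambda: grouped.apply(lambda x: len(set(x)))),
]:
    print(label, timeit.timeit(fn, number=20))

On typical setups the built-in nunique() is the fastest of the three, which is why it remains the default recommendation for simple counts.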
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products per region
unique_counts = df.groupby('Region')['Product'].nunique()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and applies nunique() to the ‘Product’ column, yielding the number of unique products sold per region. The result is a Series indexed by the regions with the counts as values. nunique() is not limited to a single column, as the sketch below shows.
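
Called on the grouped DataFrame itself rather than on a single column, nunique() returns one unique count per column per group. A minimal sketch, assuming a hypothetical extra ‘Salesperson’ column that is not part of the original example:

import pandas as pd

# Hypothetical wider table: 'Salesperson' is an illustrative extra column
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian'],
    'Salesperson': ['Ann', 'Bob', 'Cal', 'Ann']
})

# nunique() on the grouped DataFrame counts uniques in every column per group
print(df.groupby('Region').nunique())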

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios, or when you need to perform additional operations alongside counting unique values, the agg() method can be employed. It accepts custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. Note that this sample replaces the last ‘Durian’ with ‘Cherry’, so the East region now holds three unique products. The agg() route leaves room for additional manipulations such as filtering or chaining methods, which makes it more flexible than calling nunique() directly; it can also compute several aggregates in one pass, as sketched below.
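
A minimal sketch of that multi-aggregate idea, using named aggregation (the output column names here are illustrative choices, not part of the original example):

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# One pass over the groups: a unique count plus a row count per region
summary = df.groupby('Region')['Product'].agg(
    unique_products='nunique',  # number of distinct products
    total_sales='count'         # number of sales rows
)
print(summary)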

Method 3: Using groupby() with value_counts()

This method uses the value_counts() function to build a frequency distribution of values within each group. Although it directly returns the count of each distinct value rather than the number of distinct values, that result can be reduced to the unique count in a second step.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and applies value_counts() to get the frequency distribution of ‘Product’ within each region. The resulting MultiIndexed Series is then grouped again by region and counted, which yields the number of unique products per region. This route is more convoluted than nunique(), but it is worth knowing when you need the frequency distribution anyway; an equivalent reduction is sketched below.
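
Since each row of the intermediate Series corresponds to one (Region, Product) pair, the same reduction can be written by grouping on the first index level and counting rows with size(). A sketch:

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

dist = df.groupby('Region')['Product'].value_counts()
# Each row of 'dist' is one (Region, Product) pair, so counting rows per
# region gives the number of unique products
print(dist.groupby(level=0).size())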

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method lets you run custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’ and, for each group, converts the ‘Product’ values into a set, which keeps only the unique items; the length of that set is the count of unique values per group. It is similar in spirit to Method 2 but relies on the uniqueness property of Python sets. One caveat, sketched below: a set counts NaN as a member, whereas nunique() drops missing values by default.
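
A minimal sketch of that NaN caveat:

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Region': ['North', 'North', 'West'],
    'Product': ['Apple', np.nan, 'Banana']
})

# set() keeps NaN as an element; nunique() drops it by default
print(df.groupby('Region')['Product'].apply(lambda x: len(set(x))))  # North: 2
print(df.groupby('Region')['Product'].nunique())                     # North: 1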

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be invoked inside the groupby aggregation: the size of the array it returns is the unique count per group. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and the size of the resulting array of unique values is taken. This one-liner is handy when you need a quick, readable unique count. A closely related variant, which calls unique() on the grouped Series first and then measures each returned array, is sketched below.
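
The variant mentioned above, as a sketch:

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# unique() per group yields one array of distinct products per region;
# apply(len) turns each array into its length
arrays = df.groupby('Region')['Product'].unique()
print(arrays.apply(len))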

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products per region
unique_counts = df.groupby('Region')['Product'].nunique()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.
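If you need this count for more than one value column, nunique() scales naturally: called on the grouped DataFrame it counts unique values in every non-grouping column at once, and as_index=False keeps ‘Region’ as an ordinary column rather than the index. A minimal sketch reusing the same sample data:

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Unique counts for every non-grouping column at once
print(df.groupby('Region').nunique())

# Keep 'Region' as a regular column instead of the index
print(df.groupby('Region', as_index=False)['Product'].nunique())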

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios, or when counting unique values is only one of several operations you need, the agg() method can be employed. It accepts custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count the unique ‘Product’ values in each ‘Region’. This approach leaves room for additional manipulations, such as filtering or chaining further methods, which makes it more flexible than calling nunique() directly.
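To make that flexibility concrete, the sketch below combines the unique count with a plain row count in a single agg() call using pandas’ named-aggregation syntax; the labels n_unique and n_sales are just names chosen for this example:

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Compute the unique count and the total row count per region in one pass
stats = df.groupby('Region')['Product'].agg(
    n_unique=lambda x: len(x.unique()),  # distinct products
    n_sales='count'                      # rows (sales) per region
)
print(stats)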

Method 3: Using groupby() with value_counts()

This method uses the value_counts() function to build a frequency distribution of values within each group. Although it directly returns the count of each distinct value rather than the number of distinct values, the latter can be derived from its result.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This route is more convoluted than calling nunique() directly, but the intermediate distribution is sometimes useful in its own right.
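As a hedged sketch of that reuse: value_counts() sorts each group’s counts in descending order by default, so taking the first entry per region also yields the most frequent product alongside the unique count:

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

product_distribution = df.groupby('Region')['Product'].value_counts()

# Counts are sorted descending within each region, so head(1)
# picks each region's most frequent product
print(product_distribution.groupby('Region').head(1))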

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method lets you run custom operations on each group. In this case, a set can be used to identify unique values, since sets by definition contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’ and, for each group, converts the ‘Product’ column into a set, keeping only the unique values. The length of that set then gives the count of unique values for the group. The idea is similar to Method 2, but it relies on the uniqueness property of Python sets.
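Because apply() hands you each group as a raw Series, the set can feed arbitrary custom logic that nunique() cannot express. A sketch under an invented condition (excluding ‘Apple’ is purely illustrative):

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count distinct products per region, ignoring 'Apple' (illustrative filter)
non_apple_counts = df.groupby('Region')['Product'].apply(
    lambda x: len(set(x) - {'Apple'})
)
print(non_apple_counts)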

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and works well for straightforward unique-value counts.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and the size of the resulting array of unique values is taken. This one-liner is handy when you need a quick, elegant unique-value count.
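One caveat worth noting: nunique() drops missing values by default, while Series.unique() keeps NaN, so the two approaches can disagree on data with gaps. A minimal sketch of the difference:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'North', 'West'],
    'Product': ['Apple', np.nan, 'Banana']
})

# unique() keeps NaN, so North counts 2 here...
print(df.groupby('Region')['Product'].agg(lambda x: x.unique().size))

# ...while nunique() ignores NaN by default, so North counts 1
print(df.groupby('Region')['Product'].nunique())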

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products per region
unique_counts = df.groupby('Region')['Product'].nunique()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products per region
unique_counts = df.groupby('Region')['Product'].nunique()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. This approach allows for additional manipulations such as filtering or chaining methods, which can provide more flexibility compared to using nunique() directly.

Method 3: Using groupby() with value_counts()

This method involves the use of the value_counts() function to create a frequency distribution of unique values grouped by a specified column. Although directly it returns counts of each unique value, we can use this to derive the number of unique values indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and then applies value_counts() to get the frequency distribution of the ‘Product’. The result is then grouped again and counted to get the number of unique products per region. This method is a bit more convoluted and usually not as straightforward as using nunique().

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be useful to apply custom operations per group. In this case, a set can be used to identify unique values, as sets naturally contain only unique items.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods.
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’, and for each group, it converts the ‘Product’ column into a set, thereby keeping only the unique values. Then, the length of the set is taken which gives the count of unique values per group. It’s somewhat similar to Method 2 but utilizes sets for the uniqueness property.

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products per region
unique_counts = df.groupby('Region')['Product'].nunique()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This code snippet groups the DataFrame by the ‘Region’ column and then applies the nunique() method on the ‘Product’ column to find the number of unique products sold per region. The result is a Series indexed by the unique regions with the counts as values.
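The same idea scales beyond a single column: calling nunique() on the grouped DataFrame itself returns the unique count for every non-grouping column at once. Here is a minimal sketch, assuming the same sales data plus a hypothetical 'Salesperson' column added for illustration:

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian'],
    'Salesperson': ['Ann', 'Bob', 'Ann', 'Cara', 'Bob', 'Dan', 'Cara']  # hypothetical extra column
})

# One unique count per column, per region
print(df.groupby('Region').nunique())

By default nunique() ignores missing values; if NaN should count as its own category, pass dropna=False.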

Method 2: Using groupby() with agg() and a custom function

For more complex scenarios or when you need to perform additional operations along with counting unique values, the agg() method can be employed. This method allows for the use of custom aggregation functions, including lambda functions, for group-wise operations.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Count unique products per region with a custom function
unique_counts = df.groupby('Region')['Product'].agg(lambda x: len(x.unique()))

print(unique_counts)

Output:

Region
East     3
North    1
West     1
Name: Product, dtype: int64

In this code, a lambda function is passed to agg() to count unique ‘Product’ values in each ‘Region’. Note that this sample DataFrame swaps the final ‘Durian’ for ‘Cherry’, which is why East now shows three unique products instead of two. This approach allows for additional manipulations such as filtering or method chaining, which can provide more flexibility compared to using nunique() directly.
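Because agg() accepts several aggregations at once, the unique count can also be computed alongside other statistics in a single pass. A minimal sketch using pandas’ named-aggregation syntax (the result column names unique_products and total_rows are our own choices):

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Cherry']
})

# Unique count and row count per region in one call
summary = df.groupby('Region')['Product'].agg(
    unique_products='nunique',  # number of distinct products
    total_rows='count'          # number of sales records
)

print(summary)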

Method 3: Using groupby() with value_counts()

This method uses the value_counts() function to create a frequency distribution of values within each group. Although it directly returns the count of each distinct value rather than the number of distinct values, the result can be used to derive the unique count indirectly.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Calculate frequency distribution and then the number of unique products
product_distribution = df.groupby('Region')['Product'].value_counts()
unique_counts = product_distribution.groupby('Region').count()

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

The code groups the DataFrame by the ‘Region’ column and applies value_counts() to get the frequency distribution of ‘Product’ within each region. The result is then grouped again and its rows counted to obtain the number of unique products per region. This approach is more convoluted than calling nunique() directly.
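To see why the second groupby works, it helps to inspect the intermediate result: value_counts() produces a Series indexed by a (Region, Product) MultiIndex, with exactly one row per distinct product in each region. Counting those rows per region, for instance with size(), therefore equals the unique count. A minimal sketch:

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# One row per (Region, Product) pair, holding that pair's frequency
product_distribution = df.groupby('Region')['Product'].value_counts()
print(product_distribution)

# Counting rows per region gives the number of distinct products
print(product_distribution.groupby(level='Region').size())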

Method 4: Using groupby() and apply() with a set

When working with groupby objects, the apply() method can be used to run custom operations on each group. Here, a set identifies the unique values, since sets by definition contain no duplicates.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Use apply with a set to count unique products
unique_counts = df.groupby('Region')['Product'].apply(lambda x: len(set(x)))

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

This snippet groups the DataFrame by ‘Region’ and, for each group, converts the ‘Product’ column into a set, keeping only the unique values. The length of the set then gives the count of unique values per group. It’s similar in spirit to Method 2 but relies on the uniqueness property of sets.
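The set-based version earns its keep when the notion of “unique” needs adjusting, because ordinary set operations apply. For instance, a placeholder value can be excluded while counting; in this sketch the ‘Unknown’ sentinel is purely hypothetical:

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Unknown', 'Banana', 'Apple', 'Durian']
})

# Count unique products per region, excluding the 'Unknown' placeholder
unique_counts = df.groupby('Region')['Product'].apply(
    lambda x: len(set(x) - {'Unknown'})  # set difference drops the sentinel
)

print(unique_counts)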

Bonus One-Liner Method 5: Using groupby() and pd.Series.unique

For a more concise approach, the unique() method of the pd.Series class can be applied directly within the groupby operation. This method is succinct and performs well for counting unique values.

Here’s an example:

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# Count unique products with a one-liner
unique_counts = df.groupby('Region')['Product'].agg(lambda x: x.unique().size)

print(unique_counts)

Output:

Region
East     2
North    1
West     1
Name: Product, dtype: int64

Here, unique() is called on each grouped ‘Product’ Series, and then the size of the resulting unique values array is calculated. This one-liner is handy for quick operations when you need to count unique values elegantly.
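An equivalent spelling skips the lambda entirely: grouped Series expose unique() directly, returning an array of distinct values per group, which can then be measured with len(). A minimal sketch:

import pandas as pd

df = pd.DataFrame({
    'Region': ['North', 'West', 'North', 'East', 'West', 'East', 'East'],
    'Product': ['Apple', 'Banana', 'Apple', 'Durian', 'Banana', 'Apple', 'Durian']
})

# unique() per group yields the array of distinct products for each region
unique_values = df.groupby('Region')['Product'].unique()
print(unique_values.apply(len))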

Summary/Discussion

  • Method 1: Using groupby() and nunique(). Straightforward and readable. Limited to simple unique value counting.
  • Method 2: Using groupby() with agg() and a custom function. Flexible for complex operations. Slightly less readable due to custom functions.
  • Method 3: Using groupby() with value_counts(). Indirect method for counting unique values. Can be more verbose and less intuitive.
  • Method 4: Using groupby() and apply() with a set. Utilizes the uniqueness property of sets. Good for custom or complex operations, but may be less efficient than other methods (see the timing sketch below).
  • Bonus Method 5: One-liner using groupby() and pd.Series.unique. Very concise and clean. May lack some extensibility for complex scenarios.
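The efficiency caveat in the Method 4 bullet is easy to check on your own data. Below is a rough timing sketch using Python’s timeit module on randomly generated data; absolute numbers will vary by machine and pandas version, so treat it as a template rather than a benchmark result:

import timeit

import numpy as np
import pandas as pd

# A larger random DataFrame so the timings are meaningful
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'Region': rng.choice(['North', 'South', 'East', 'West'], size=100_000),
    'Product': rng.choice([f'P{i}' for i in range(50)], size=100_000)
})

print('nunique:  ', timeit.timeit(
    lambda: df.groupby('Region')['Product'].nunique(), number=20))
print('set+apply:', timeit.timeit(
    lambda: df.groupby('Region')['Product'].apply(lambda x: len(set(x))), number=20))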