Here is a Python code snippet that generates sample data for a transaction network graph; we'll then build a basic fraud detection model on the generated data. For simplicity, I'll use the NetworkX library to create and manipulate the graph. Let's start with the data generation:
import random
import pandas as pd

def generate_data(num_accounts=100, num_merchants=20, num_transactions=500):
    # Generate account holders
    accounts = [{'account_id': f'Acc_{i}', 'customer_id': f'Customer_{i}', 'account_type': random.choice(['personal', 'business'])}
                for i in range(1, num_accounts + 1)]
    # Generate merchants
    merchants = [{'merchant_id': f'Merchant_{i}', 'business_type': random.choice(['Retail', 'Food', 'Services'])}
                 for i in range(1, num_merchants + 1)]
    # Generate transactions, each linking a random account to a random merchant
    transactions = [{'transaction_id': f'Transaction_{i}',
                     'account_id': random.choice(accounts)['account_id'],
                     'merchant_id': random.choice(merchants)['merchant_id'],
                     'amount': round(random.uniform(10, 5000), 2),
                     'timestamp': pd.Timestamp.now() - pd.Timedelta(days=random.randint(1, 365))}
                    for i in range(1, num_transactions + 1)]
    return pd.DataFrame(accounts), pd.DataFrame(merchants), pd.DataFrame(transactions)

# Generate sample data
df_accounts, df_merchants, df_transactions = generate_data()
# Display sample dataframes
print("Sample Account Data:")
print(df_accounts.head())
print("\nSample Merchant Data:")
print(df_merchants.head())
print("\nSample Transaction Data:")
print(df_transactions.head())
This code will generate sample data for accounts, merchants, and transactions. Now, let's create a basic fraud detection model using this data. For simplicity, we'll just look for unusually large transactions, using a fixed amount threshold rather than a comparison against each account holder's typical transactions. We'll assume that such transactions could be indicative of fraud.
import networkx as nx
import matplotlib.pyplot as plt

# Create a directed graph
G = nx.DiGraph()
# Add nodes for accounts and merchants
for _, row in df_accounts.iterrows():
    G.add_node(row['account_id'], type='account')
for _, row in df_merchants.iterrows():
    G.add_node(row['merchant_id'], type='merchant')
# Add transaction edges (a DiGraph keeps one edge per account-merchant pair,
# so a repeated pair overwrites the earlier edge's attributes)
for _, row in df_transactions.iterrows():
    G.add_edge(row['account_id'], row['merchant_id'], amount=row['amount'], transaction_id=row['transaction_id'])
# Define a simple fraud detection rule: flag transactions with amounts > 3000
flagged = df_transactions[df_transactions['amount'] > 3000]
fraudulent_transactions = flagged['transaction_id'].tolist()
# Transactions are edges, not nodes, so collect the (account, merchant) pairs to highlight
fraudulent_edges = list(set(zip(flagged['account_id'], flagged['merchant_id'])))
# Calculate positions of nodes using a layout algorithm
pos = nx.spring_layout(G, seed=42)
# Visualize the graph
plt.figure(figsize=(10, 8))
nx.draw(G, pos, with_labels=True, node_color='lightblue', node_size=500, font_size=10, edge_color='gray', width=1.0)
# Redraw the flagged edges in red on top of the base graph
nx.draw_networkx_edges(G, pos, edgelist=fraudulent_edges, edge_color='red', width=2.0, node_size=500)
plt.title('Transaction Network Graph with Fraud Detection')
plt.show()
# Print fraudulent transactions
print("Fraudulent Transactions:")
print(fraudulent_transactions)
This code snippet creates a directed graph using the NetworkX library, where nodes represent accounts and merchants, and edges represent transactions. It then applies a simple fraud detection rule to identify transactions with amounts greater than $3000. Finally, it visualizes the graph, highlighting fraudulent transactions in red.
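The fixed $3000 threshold treats every account the same. A minimal sketch of a per-account alternative, closer to the idea of comparing each transaction with the account holder's typical spending, might look like the following. The function name `flag_unusual_transactions`, the `multiplier` of 3, and the `min_history` cutoff are illustrative assumptions rather than tuned values, and the small hand-made DataFrame only mirrors the column names used above.

```python
import pandas as pd

def flag_unusual_transactions(df_transactions, multiplier=3.0, min_history=3):
    """Flag transactions well above the account's own median amount.

    `multiplier` and `min_history` are assumed, illustrative defaults.
    """
    grouped = df_transactions.groupby('account_id')['amount']
    # Per-account baseline, broadcast back onto each transaction row
    median = grouped.transform('median')
    count = grouped.transform('count')
    # Only judge accounts with enough history to have a usable baseline
    mask = (count >= min_history) & (df_transactions['amount'] > multiplier * median)
    return df_transactions[mask]

# Small hand-made example so the result is easy to check by eye
example = pd.DataFrame({
    'transaction_id': ['T1', 'T2', 'T3', 'T4', 'T5', 'T6', 'T7'],
    'account_id': ['Acc_1', 'Acc_1', 'Acc_1', 'Acc_1', 'Acc_2', 'Acc_2', 'Acc_2'],
    'amount': [20.0, 25.0, 30.0, 400.0, 3500.0, 3600.0, 3700.0],
})
flagged = flag_unusual_transactions(example)
print(flagged['transaction_id'].tolist())  # ['T4']
```

Here `Acc_2` spends more than $3000 on every transaction, so nothing about it stands out, while the $400 charge on `Acc_1` is far above that account's usual amounts. Because the sample data above is drawn uniformly at random, it has no real per-account spending pattern, so a rule like this is more informative on data where accounts behave consistently.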