Lab 21: Pedigree Rendering and Visualization
Core Component: This lab explores the pedigree rendering and visualization techniques used in Bonsai v3. These techniques are essential for helping users interpret and understand the results of genetic genealogy analyses. Effective visualization makes complex pedigree structures more accessible and highlights important genetic relationships.
The Importance of Visualization in Genetic Genealogy
Why Visualization Matters
Genetic pedigrees are complex structures that can be difficult to interpret from raw data alone. Effective visualization serves several critical functions:
- Intuitive Understanding: Translates abstract genetic relationships into visually intuitive family structures
- Pattern Recognition: Helps identify patterns and connections that might be missed in tabular data
- Communication: Facilitates sharing and discussing findings with others
- Validation: Provides a way to visually confirm that inferred relationships make biological sense
Bonsai v3 renders pedigrees by treating them as directed graphs and handing them to Graphviz for layout, which produces clear, informative diagrams even for complex pedigree structures.
Graph-Based Representation of Pedigrees
Pedigrees as Directed Graphs
In Bonsai v3, pedigrees are naturally represented as directed graphs, where:
- Nodes represent individuals
- Edges represent parent-child relationships
- Direction flows from parent to child
This graph-based representation enables the application of powerful graph algorithms for analyzing family structures and provides a natural basis for visualization.
Up-Node Dictionary: The Foundation of Pedigree Representation
Bonsai v3 uses an "up-node dictionary" as its primary data structure for representing pedigrees. This dictionary maps each individual to their parents:
```python
{
    child_id_1: {parent_id_1: degree, parent_id_2: degree},
    child_id_2: {parent_id_3: degree, parent_id_4: degree},
    ...
}
```
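This structure encodes the directed graph of the pedigree: each key is a child node, and each value is a dictionary whose keys are that child's parents. The number stored with each parent is the degree of the link, which is 1 for a direct parent-child edge in the examples in this lab.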
Example: Simple Family Structure
Consider a simple family with grandparents (1, 2), parents (3, 4), and a child (5). In the up-node dictionary format:
```python
{
    3: {1: 1, 2: 1},  # Individual 3 has parents 1 and 2
    4: {},            # Individual 4 has no parents in the pedigree
    5: {3: 1, 4: 1},  # Individual 5 has parents 3 and 4
}
```
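This compact representation captures the entire family structure and can be easily rendered as a directed graph. Note that grandparents 1 and 2 appear only as parents and have no keys of their own, so code that needs every individual must look at both the keys and the values.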
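The example dictionary can be explored with a few lines of plain Python. The sketch below is illustrative only: the helper names collect_ids, parent_child_edges, and founders are assumptions made for this example and are not part of Bonsai's API.

```python
def collect_ids(up_dct):
    """Return every ID in the pedigree, whether it appears as a child or as a parent."""
    ids = set(up_dct)
    for parents in up_dct.values():
        ids.update(parents)
    return ids


def parent_child_edges(up_dct):
    """List (parent, child) pairs, i.e. the directed edges of the pedigree graph."""
    return sorted((p, c) for c, parents in up_dct.items() for p in parents)


def founders(up_dct):
    """Individuals with no recorded parents (missing from the keys or mapped to {})."""
    return sorted(i for i in collect_ids(up_dct) if not up_dct.get(i))


up_dct = {3: {1: 1, 2: 1}, 4: {}, 5: {3: 1, 4: 1}}
print(collect_ids(up_dct))         # all five individuals
print(parent_child_edges(up_dct))  # [(1, 3), (2, 3), (3, 5), (4, 5)]
print(founders(up_dct))            # [1, 2, 4]
```

The edge list is exactly what a graph renderer needs: one arrow from each parent to each child.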
The render_ped Function in Bonsai v3
Core Visualization Function
At the heart of Bonsai's pedigree visualization is the render_ped function in the rendering.py module, which converts an up-node dictionary into a graphical representation using the Graphviz library:
```python
def render_ped(
    up_dct: dict[int, dict[int, int]],
    name: str,
    out_dir: str,
    color_dict=None,
    label_dict=None,
    focal_id=None,
):
    """
    Render a pedigree as a directed graph.

    Args:
        up_dct: Up-node dictionary mapping individuals to their parents
        name: Base name for the output file
        out_dir: Directory to save the rendered image
        color_dict: Dictionary mapping node IDs to colors
        label_dict: Dictionary mapping node IDs to labels
        focal_id: ID of the focal individual to highlight
    """
    dot = graphviz.Digraph(name)
    all_id_set = get_all_id_set(up_dct)

    # Set default values
    if color_dict is None:
        color_dict = {i: 'dodgerblue' for i in all_id_set if i > 0}
    if label_dict is None:
        label_dict = {n: str(n) for n in all_id_set}
    if focal_id is not None:
        color_dict[focal_id] = 'red'

    # Add nodes (individuals)
    for n in all_id_set:
        edge_color = None
        fill_color = color_dict[n] if n in color_dict else None
        style = 'filled' if n in color_dict else None
        label = label_dict.get(n, "")
        dot.node(
            str(n),
            color=edge_color,
            fillcolor=fill_color,
            style=style,
            label=label,
        )

    # Add edges (parent-child relationships)
    for c, pset in up_dct.items():
        for p in pset:
            dot.edge(str(p), str(c), arrowhead='none')

    # Render the graph
    plt.clf()
    dot.render(directory=out_dir).replace('\\', '/')
```
This function provides a flexible foundation for pedigree visualization, with options to customize colors, labels, and highlight focal individuals.
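Graphviz itself consumes text in the DOT language, so it can help to see what kind of graph description a function like render_ped ends up building. The following is a minimal sketch using only the standard library; up_dct_to_dot is a hypothetical name, and the output is a simplified approximation rather than the exact text Bonsai generates.

```python
def up_dct_to_dot(up_dct, focal_id=None):
    """Build DOT source: one node per individual, one edge per parent-child pair."""
    ids = set(up_dct)
    for parents in up_dct.values():
        ids.update(parents)

    lines = ["digraph pedigree {"]
    for n in sorted(ids):
        # Highlight the focal individual, as render_ped does with focal_id
        color = "red" if n == focal_id else "dodgerblue"
        lines.append(f'    "{n}" [label="{n}", style=filled, fillcolor={color}];')
    for child in sorted(up_dct):
        for parent in sorted(up_dct[child]):
            # arrowhead=none mirrors render_ped: direction is implied by the layout
            lines.append(f'    "{parent}" -> "{child}" [arrowhead=none];')
    lines.append("}")
    return "\n".join(lines)


print(up_dct_to_dot({3: {1: 1, 2: 1}, 4: {}, 5: {3: 1, 4: 1}}, focal_id=5))
```

The resulting text can be saved to a file and rendered with the dot command-line tool, which is a convenient way to inspect or hand-edit a layout.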
Customizing Pedigree Visualizations
Beyond Basic Rendering
While Bonsai's render_ped function provides solid baseline functionality, there are many ways to enhance and customize pedigree visualizations:
1. Node Attributes by Individual Characteristics
Nodes can be customized to represent individual characteristics:
- Color coding by sex (blue for males, pink for females)
- Shape variation by status (rectangles for living, ovals for deceased)
- Border styles for additional attributes (dashed for adopted, dotted for uncertain)
- Size variation for emphasis or to represent additional metrics
Example: Sex-Based Node Styling
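```python
# Assumes `pedigree` is an up-node dictionary, `sex_dict` maps IDs to 'M' or 'F',
# and `dot` is a graphviz.Digraph.
# Collect every individual, including founders who appear only as parents
all_ids = set(pedigree)
for parents in pedigree.values():
    all_ids.update(parents)

# Create dictionaries for customization
color_dict = {
    id_val: 'skyblue' if sex_dict.get(id_val) == 'M' else 'pink'
    for id_val in all_ids
}
shape_dict = {
    id_val: 'box' if sex_dict.get(id_val) == 'M' else 'ellipse'
    for id_val in all_ids
}

# Use in enhanced rendering function
for node_id in all_ids:
    dot.node(
        str(node_id),
        fillcolor=color_dict.get(node_id, 'white'),
        shape=shape_dict.get(node_id, 'box'),
        style='filled',
    )
```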
2. Edge Styling for Relationship Information
Edges can be styled to convey relationship information:
- Color variation for relationship types or confidence
- Line thickness for relationship closeness or strength of evidence
- Line styles for relationship types (solid for biological, dashed for adoptive)
- Edge labels for additional relationship details
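One way to keep edge styling consistent is to derive all Graphviz attributes from a single function. The sketch below is a hypothetical encoding: the relationship labels, colors, and the 0-to-1 confidence scale are illustrative assumptions, not values defined by Bonsai.

```python
# Hypothetical visual encoding; the labels and colors are illustrative choices.
EDGE_STYLES = {
    "biological": {"style": "solid", "color": "black"},
    "adoptive": {"style": "dashed", "color": "gray40"},
}


def edge_attrs(rel_type, confidence):
    """Return Graphviz edge attributes for a relationship type and a confidence in [0, 1]."""
    attrs = dict(EDGE_STYLES.get(rel_type, {"style": "dotted", "color": "gray60"}))
    # Thicker lines for stronger evidence: penwidth runs from 1.0 to 3.0
    clamped = max(0.0, min(1.0, confidence))
    attrs["penwidth"] = f"{1.0 + 2.0 * clamped:.1f}"
    return attrs


print(edge_attrs("biological", 1.0))
print(edge_attrs("adoptive", 0.25))
```

With the graphviz Python package, the returned dictionary can be passed straight through as keyword arguments, for example dot.edge(str(parent), str(child), **edge_attrs("adoptive", 0.25)).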
3. Layout Customization
The layout of the pedigree can significantly impact its interpretability:
- Direction settings (top-down, bottom-up, left-right)
- Node spacing for clearer visual separation
- Subgraph clustering for organizing related individuals
- Rank alignment to position individuals by generation
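Rank alignment needs a generation number for each individual. The sketch below shows one possible way to derive it from an up-node dictionary and to emit DOT rank=same groups; the function names are hypothetical, and the rule for placing married-in founders (one row above their shallowest child) is a simplifying assumption that presumes an acyclic pedigree.

```python
from collections import defaultdict


def generations(up_dct):
    """Map each individual to a row: 0 for top-level founders, larger for descendants."""
    depth = {}

    def visit(i):
        if i not in depth:
            depth[i] = 1 + max((visit(p) for p in up_dct.get(i, {})), default=-1)
        return depth[i]

    children = defaultdict(list)
    for child, parents in up_dct.items():
        visit(child)
        for p in parents:
            children[p].append(child)

    # Founders who married in sit one row above their shallowest child
    for founder, kids in children.items():
        if not up_dct.get(founder):
            depth[founder] = min(depth[k] for k in kids) - 1
    return depth


def rank_statements(up_dct):
    """Emit one DOT 'rank=same' group per generation so peers share a row."""
    by_gen = defaultdict(list)
    for i, g in generations(up_dct).items():
        by_gen[g].append(i)
    return [
        "{ rank=same; " + " ".join(f'"{i}";' for i in sorted(ids)) + " }"
        for g, ids in sorted(by_gen.items())
    ]


for line in rank_statements({3: {1: 1, 2: 1}, 4: {}, 5: {3: 1, 4: 1}}):
    print(line)
```

For the example family this places 1 and 2 on the first row, 3 and 4 on the second, and 5 on the third. The direction of the rows is controlled separately by the rankdir graph attribute (for example TB for top-down or LR for left-right).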
Graphviz Layout Options
Graphviz supports several layout algorithms that can be applied to pedigrees:
- dot: Hierarchical layout ideal for pedigrees with clear generational structure
- neato: Spring model layout useful for pedigrees with many interconnections
- fdp: Force-directed layout good for large pedigrees
- twopi: Radial layout that places a chosen root node (for example, the focal individual) at the center and arranges the other nodes on concentric circles around it
Visualizing IBD Sharing in Pedigrees
Integrating Genetic Evidence in Visualizations
One of the most powerful applications of pedigree visualization is the ability to overlay genetic sharing information onto the family structure. This helps users understand how genetic evidence supports the inferred relationships.
IBD Overlay Techniques
- Additional edges between individuals who share DNA
- Edge thickness proportional to amount of shared DNA
- Color gradients to indicate strength of genetic connection
- Edge labels showing total cM and segment counts
Example: Adding IBD Sharing Information
```python
# First, add regular parent-child edges
for child, parents in pedigree.items():
    for parent in parents:
        dot.edge(
            str(parent),
            str(child),
            color='black',
            style='solid',
            penwidth='1'
        )

# Then add IBD sharing edges with custom styling
for (id1, id2), data in ibd_data.items():
    total_cm = data['total_cm']

    # Skip if below threshold
    if total_cm < min_cm:
        continue

    # Calculate edge attributes based on total cM
    # Thicker edges for more sharing
    penwidth = 0.5 + min(5, total_cm / 500)

    # Color intensity based on total cM
    intensity = min(255, int(50 + (total_cm / 3500) * 205))
    color = f"#{intensity:02x}00{255-intensity:02x}"

    # Add the IBD edge
    dot.edge(
        str(id1),
        str(id2),
        color=color,
        style='dashed',
        penwidth=str(penwidth),
        constraint='false',  # Don't use this edge for layout
        label=f"{total_cm:.1f} cM"
    )
```
This technique creates a visual representation that combines the structural information of the pedigree with the genetic evidence supporting those relationships, providing a more complete picture.
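To see what the width and color formulas in the example above produce, they can be pulled into a small function and evaluated at a few sharing levels. The function name ibd_edge_style is introduced here purely for illustration; the constants are the ones used in the example.

```python
def ibd_edge_style(total_cm):
    """Map total shared cM to (penwidth, hex color) using the example's formulas."""
    # Width grows by 1 per 500 cM and stops growing at 2500 cM
    penwidth = 0.5 + min(5, total_cm / 500)
    # Red channel rises from 50 to 255 as sharing approaches 3500 cM
    intensity = min(255, int(50 + (total_cm / 3500) * 205))
    # Blue channel falls as red rises, giving a blue-to-red gradient
    color = f"#{intensity:02x}00{255 - intensity:02x}"
    return penwidth, color


for cm in (100, 1750, 3500):
    print(cm, ibd_edge_style(cm))
# 100  -> (0.7, '#3700c8')  thin, mostly blue
# 1750 -> (4.0, '#980067')  thick, purple
# 3500 -> (5.5, '#ff0000')  maximum width, pure red
```

Because the width saturates at 2500 cM and the color at 3500 cM, the two channels distinguish different parts of the sharing range; the thresholds can be tuned to the relationships expected in a given dataset.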
Chromosome Painting Visualizations
Visualizing Segment-Level IBD Sharing
Chromosome painting is another important visualization technique in genetic genealogy that complements pedigree diagrams by showing the specific chromosomal segments shared between individuals.
While not directly implemented in Bonsai's rendering module, chromosome painting can be created using matplotlib to provide detailed segment-level information:
Example: Chromosome Painting Implementation
```python
import matplotlib.pyplot as plt


def create_chromosome_painting(individual_id, ibd_data, figsize=(15, 10)):
    """
    Create a chromosome painting visualization for an individual.

    Args:
        individual_id: ID of the individual to visualize
        ibd_data: Dictionary mapping pairs of individuals to IBD sharing data
        figsize: Figure size (width, height)

    Returns:
        Matplotlib figure
    """
    # Extract segments involving the individual
    segments = []
    for (id1, id2), data in ibd_data.items():
        if id1 == individual_id or id2 == individual_id:
            other_id = id2 if id1 == individual_id else id1
            for segment in data['segments']:
                segments.append({
                    # Store chromosomes as strings so '1' and 1 are treated alike
                    'chromosome': str(segment['chromosome']),
                    'start_pos': segment['start_pos'],
                    'end_pos': segment['end_pos'],
                    'cm': segment['cm'],
                    'other_id': other_id
                })

    # Without segments there is nothing to draw (and plt.subplots(0, 1) would fail)
    if not segments:
        raise ValueError(f"No IBD segments found for individual {individual_id}")

    # Sort segments by chromosome and position
    segments.sort(key=lambda s: (
        int(s['chromosome']) if s['chromosome'].isdigit() else 999,
        s['start_pos']
    ))

    # Get the unique chromosomes
    chromosomes = sorted(set(s['chromosome'] for s in segments),
                         key=lambda x: int(x) if x.isdigit() else 999)

    # Create figure with one subplot per chromosome
    fig, axs = plt.subplots(len(chromosomes), 1, figsize=figsize,
                            squeeze=False, sharex=True)
    axs = axs.flatten()

    # Create a color map for each unique "other_id"
    other_ids = sorted(set(s['other_id'] for s in segments))
    colors = plt.cm.tab10.colors
    color_map = {other_id: colors[i % len(colors)]
                 for i, other_id in enumerate(other_ids)}

    # Draw segments on each chromosome
    for i, chrom in enumerate(chromosomes):
        ax = axs[i]
        chrom_segments = [s for s in segments if s['chromosome'] == chrom]

        # Draw chromosome backbone
        ax.plot([0, max(s['end_pos'] for s in chrom_segments)], [0, 0],
                'k-', linewidth=2)

        # Draw segments
        for segment in chrom_segments:
            other_id = segment['other_id']
            ax.plot(
                [segment['start_pos'], segment['end_pos']],
                [0, 0],
                '-',
                linewidth=10,
                color=color_map[other_id],
                solid_capstyle='butt',
                alpha=0.7
            )

        # Label each row so the chromosomes can be told apart
        ax.set_ylabel(f"chr{chrom}")
        ax.set_yticks([])

    return fig
```
Chromosome paintings provide complementary information to pedigree diagrams, showing exactly which parts of the genome are shared between individuals. When used together, these visualization techniques offer a comprehensive view of genetic relationships.
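One limitation of drawing every segment at y = 0, as create_chromosome_painting does, is that overlapping segments from different matches are painted on top of each other. A possible refinement, sketched below under the assumption that segments use the same start_pos and end_pos keys, is to assign each segment to a lane so that overlapping segments are stacked instead. The helper assign_lanes is hypothetical and not part of Bonsai.

```python
def assign_lanes(segments):
    """Give each segment a lane index so that overlapping segments never share a lane.

    segments: list of dicts with 'start_pos' and 'end_pos' on a single chromosome.
    Returns a list of (lane, segment) pairs ordered by start position.
    """
    lane_ends = []  # lane_ends[k] is the end position of the last segment in lane k
    placed = []
    for seg in sorted(segments, key=lambda s: s['start_pos']):
        for lane, end in enumerate(lane_ends):
            if seg['start_pos'] >= end:  # this lane is free again
                lane_ends[lane] = seg['end_pos']
                break
        else:
            lane = len(lane_ends)  # every lane is busy, so open a new one
            lane_ends.append(seg['end_pos'])
        placed.append((lane, seg))
    return placed


demo = [
    {'start_pos': 0, 'end_pos': 50},
    {'start_pos': 30, 'end_pos': 80},
    {'start_pos': 60, 'end_pos': 90},
]
print([lane for lane, _ in assign_lanes(demo)])  # [0, 1, 0]
```

When plotting, the lane index can replace the constant 0 in the y coordinates, so each lane becomes its own horizontal band within the chromosome's subplot.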
Practical Applications and Best Practices
Using Pedigree Visualization Effectively
Effective pedigree visualization is both an art and a science. Here are some best practices for creating clear, informative pedigree visualizations:
Challenge | Solution | Implementation |
---|---|---|
Large pedigrees become unwieldy | Focus on subtrees of interest | Extract subtrees using get_subdict() before rendering |
Unclear relationship types | Use consistent visual encoding | Standardize edge styles and colors for relationship types |
Difficulty identifying key individuals | Highlight focal individuals | Use the focal_id parameter or custom colors |
Overlapping edges in complex pedigrees | Adjust layout algorithms | Try different Graphviz engines (dot, neato, fdp) |
Uncertainty in relationships | Encode confidence visually | Use dashed lines or color gradients for uncertain connections |
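To illustrate the first row of the table, here is a minimal sketch of extracting the ancestors of one individual before rendering. It is not Bonsai's get_subdict, whose exact signature is not shown in this lab; ancestor_subdict and its max_generations parameter are assumptions made for the example.

```python
from collections import deque


def ancestor_subdict(up_dct, focal_id, max_generations=None):
    """Return the part of up_dct reachable by walking upward from focal_id."""
    sub = {}
    queue = deque([(focal_id, 0)])  # breadth-first, so each node is first seen at its nearest generation
    while queue:
        node, gen = queue.popleft()
        if node in sub:
            continue
        if max_generations is not None and gen >= max_generations:
            sub[node] = {}  # cut the tree here: keep the node, drop its parents
            continue
        sub[node] = dict(up_dct.get(node, {}))
        queue.extend((p, gen + 1) for p in sub[node])
    return sub


up_dct = {3: {1: 1, 2: 1}, 4: {}, 5: {3: 1, 4: 1}}
print(ancestor_subdict(up_dct, 5, max_generations=1))
# {5: {3: 1, 4: 1}, 3: {}, 4: {}}
```

The result is itself an up-node dictionary, so it can be passed to a rendering function in place of the full pedigree.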
Common Applications of Pedigree Visualization
- Verifying relationship hypotheses by visualizing how they fit into existing family structures
- Identifying potential connections between seemingly unrelated individuals
- Documenting complex family histories for genealogical research
- Communicating findings to family members and other researchers
- Validating genetic analysis results by ensuring they form biologically plausible structures
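Some plausibility checks can be automated before a pedigree is ever drawn. The sketch below, with the hypothetical name check_pedigree, tests two structural rules on an up-node dictionary: nobody has more than two parents, and nobody is their own ancestor. It is an illustration of the idea and not a check that Bonsai is known to perform in this form.

```python
def check_pedigree(up_dct):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    for child, parents in up_dct.items():
        if len(parents) > 2:
            problems.append(f"{child} has {len(parents)} parents")

    # Cycle check via depth-first search: nobody may be their own ancestor
    state = {}  # 1 = on the current path, 2 = fully explored

    def has_cycle(node):
        if state.get(node) == 1:
            return True
        if state.get(node) == 2:
            return False
        state[node] = 1
        found = any(has_cycle(p) for p in up_dct.get(node, {}))
        state[node] = 2
        return found

    for node in up_dct:
        if has_cycle(node):
            problems.append(f"ancestry loop reachable from {node}")
            break
    return problems


print(check_pedigree({3: {1: 1, 2: 1}, 4: {}, 5: {3: 1, 4: 1}}))  # []
print(check_pedigree({1: {2: 1}, 2: {1: 1}}))
```

A diagram is still worth inspecting by eye, since checks like these catch only structural impossibilities and say nothing about implausible ages or sexes.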
Extending Bonsai's Visualization Capabilities
Beyond Basic Rendering
While Bonsai's render_ped function provides a solid foundation, it can be extended in various ways to create more sophisticated visualizations:
Interactive Visualizations
Converting static pedigree diagrams to interactive visualizations using tools like D3.js or Plotly can significantly enhance user exploration:
- Zooming and panning for navigating large pedigrees
- Hover tooltips showing detailed information about individuals
- Collapsible subtrees for managing complexity
- Dynamic filtering to show specific relationships or IBD thresholds
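Browser-based tools generally expect graph data as JSON. As a hedged sketch, the function below converts an up-node dictionary into a nodes-and-links structure of the kind commonly fed to D3.js force layouts; the function name and the field names id, label, source, and target are assumptions chosen to match that convention.

```python
import json


def to_node_link_json(up_dct, label_dict=None):
    """Serialize a pedigree as {"nodes": [...], "links": [...]} for web visualization."""
    ids = set(up_dct)
    for parents in up_dct.values():
        ids.update(parents)
    label_dict = label_dict or {}
    data = {
        "nodes": [{"id": i, "label": label_dict.get(i, str(i))} for i in sorted(ids)],
        "links": [
            {"source": p, "target": c}  # directed from parent to child
            for c in sorted(up_dct)
            for p in sorted(up_dct[c])
        ],
    }
    return json.dumps(data, indent=2)


print(to_node_link_json({3: {1: 1, 2: 1}, 4: {}, 5: {3: 1, 4: 1}}))
```

Extra per-node fields (sex, total cM to the focal individual, and so on) can be added to the node dictionaries and surfaced in hover tooltips on the client side.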
Integrating Multiple Data Types
Pedigree visualizations can be enhanced by incorporating additional data types:
- Historical records linked to specific individuals
- Geographic information showing migration patterns
- Ethnicity estimates encoded in node colors or patterns
- Timeline information showing temporal relationships
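A lightweight way to surface such data is through node labels, since render_ped already accepts a label_dict. The sketch below builds multi-line labels from a hypothetical records dictionary; the field names name, birth, and death are placeholders invented for this example.

```python
def build_labels(ids, records):
    """Compose one label per individual from whatever record fields are available."""
    labels = {}
    for i in ids:
        rec = records.get(i, {})
        lines = [rec.get("name", str(i))]  # fall back to the ID when no name is known
        if "birth" in rec or "death" in rec:
            lines.append(f"{rec.get('birth', '?')}-{rec.get('death', '')}")
        # Graphviz reads the two-character sequence backslash-n as a line break
        labels[i] = "\\n".join(lines)
    return labels


records = {1: {"name": "Parent A", "birth": 1900, "death": 1970}}
print(build_labels([1, 2], records))
```

Individuals without records keep their numeric ID as the label, so the same dictionary can be used for partially documented pedigrees.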
Community Extensions
The genetic genealogy community has developed various extensions to standard pedigree visualization:
- McGuire diagrams for visualizing shared DNA between multiple individuals
- Fan charts for compact representation of ancestral relationships
- DNA painter style visualizations for chromosome mapping
- Network graphs showing complex interrelationships in endogamous populations
Conclusion and Next Steps
Pedigree rendering and visualization are essential components of computational genetic genealogy, transforming abstract genetic relationships into intuitive visual forms. Bonsai v3's rendering capabilities provide a flexible foundation for creating clear, informative pedigree visualizations that help users interpret genetic data in the context of family structures.
By customizing node and edge attributes, integrating IBD sharing information, and applying effective visual design principles, pedigree visualizations can become powerful tools for understanding complex family relationships.
In the next lab, we'll explore how to interpret results and assess confidence in relationship predictions, complementing the visual representations we've explored in this lab with statistical measures of certainty.
Your Learning Pathway
Interactive Lab Environment
The interactive Lab 21 notebook can be run in Google Colab, which provides a hosted computing environment.
Data will be automatically downloaded from S3 when you run the notebook.
Note: You may need a Google account to save your work in Google Drive.
This lab is part of the Visualization & Advanced Applications track:
- Lab 21: Rendering
- Lab 22: Interpreting
- Lab 23: Twins
- Lab 24: Complex
- Lab 25: Real-World
- Lab 26: Performance
- Lab 27: Prior Models
- Lab 28: Integration
- Lab 29: End-to-End
- Lab 30: Advanced