Cytoscape Tutorial
1. Prepare the Data
What follows assumes you are taking a whole network approach. When you collect information from your sources, think about your data in terms of attributes relating to individual nodes (whether they be individuals, organisations, texts, etc.) and ties between them. The ties should all be of the same kind unless you include an edge attribute column that labels the type of relationship according to a particular documented ontology.
Create a Node Table - Record data about your node agents in a table using Microsoft Excel, Apple Numbers, or LibreOffice. Have one column with a unique ID number for each agent (See my YouTube tutorial for more on how to add this after the fact). While some forgo this, if you have several main sources that supply information about your nodes, consider keeping this data in separate sheets of the spreadsheet so that you can easily disaggregate the source of particular claims about your nodes. You can then create a single, dynamically loaded merged sheet where all these claims are juxtaposed in one overall node table (See my YouTube tutorial for how to do this).
Create an Edge Table - Record data about the relationships between your agents in one or more tables (if bimodal, keep a separate edge table for relationships between different types of nodes, such as human to human relationships and human to institution relationships, etc.). Decide whether you want a directed or undirected network. If undirected, you only need one row to indicate a relationship between two nodes, and the order in which you list the two nodes is not important. If directed, the order in which you enter them (in two separate columns for the source and target node) indicates the direction of the relationship. If a relationship is bi-directional, you need at least two rows, one for each direction.
Note: Correspondence Networks - If you are tracking letters between agents, the passing of a letter in one direction or another is suggestive of a directed network. While it is useful to keep track of these letters as entries on a spreadsheet in their own right, it is also useful to have an edge table which aggregates the information in each direction: one row for A to B (total letters, and perhaps the time range in begin and end columns), then a single row for B to A. You can use ‘pivot tables’ in your spreadsheet software to aggregate the data. The total letters can then be styled as the “weight” of the relationship in either direction when added to the SNA software. Alternatively, you may decide (though be aware of the shortcomings of this assumption) to treat the correspondence network as undirected and aggregate the total letters passed in either direction.
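If you would rather script the aggregation than use a pivot table, the idea can be sketched in Python roughly as follows. This is only an illustration: the letter log, the node IDs, and the column names (source, target, weight, begin, end) are assumptions made for the example, not anything Cytoscape requires.

```python
from collections import defaultdict

# Hypothetical letter log: one row per letter (sender ID, recipient ID, year).
letters = [
    ("n1", "n2", 1871),
    ("n1", "n2", 1873),
    ("n2", "n1", 1872),
    ("n3", "n1", 1880),
]


def aggregate_directed(rows):
    """Collapse a letter log into one row per (source, target) direction."""
    totals = defaultdict(lambda: {"weight": 0, "begin": None, "end": None})
    for source, target, year in rows:
        entry = totals[(source, target)]
        entry["weight"] += 1
        entry["begin"] = year if entry["begin"] is None else min(entry["begin"], year)
        entry["end"] = year if entry["end"] is None else max(entry["end"], year)
    return [
        {"source": s, "target": t, **values}
        for (s, t), values in sorted(totals.items())
    ]


def aggregate_undirected(rows):
    """Treat A->B and B->A as the same tie by sorting each pair of IDs first."""
    normalised = [(*sorted((source, target)), year) for source, target, year in rows]
    return aggregate_directed(normalised)


directed_edges = aggregate_directed(letters)      # one row per direction
undirected_edges = aggregate_undirected(letters)  # one row per pair
```

The directed version keeps A to B and B to A as separate rows, each with its own weight. The undirected version merges them, which is exactly the simplifying assumption warned about above.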
- Files for Import: When you are ready to import the tables you have created, you have several options:
- Cytoscape can import from Excel spreadsheets directly if you indicate which sheet contains the edge and node tables. However, I have seen serious data integrity problems with this, at least when the ID fields are generated from lookup functions. I would suggest you export your data from the spreadsheet and import it as a CSV.
- From any spreadsheet software you can export your data as a tab-delimited or comma-delimited file (CSV in the latter case), which is a standard portable format that can be easily imported. If you do this, save your node table and edge table in separate CSV (or tab-delimited) files. When you export, consider encoding the file as UTF-8 (for Excel, this is the CSV UTF-8 (Comma-delimited) (.csv) option in Save As…) to prevent loss of diacritics or non-Roman characters.
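If your tables are produced by a script rather than a spreadsheet, the same advice applies: write two separate files and state the encoding explicitly. Here is a minimal Python sketch. The names, column headings, and file names are all made up for illustration, and the files go to a temporary folder so that nothing on your disk is overwritten.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical node and edge tables; the names are placeholders chosen
# only because they contain diacritics and non-Roman characters.
nodes = [
    {"id": 1, "name": "Zoë Example", "type": "person"},
    {"id": 2, "name": "東京 Example Society", "type": "organisation"},
]
edges = [
    {"source": 1, "target": 2, "relation": "member of"},
]


def write_table(path, rows):
    """Write a list of dicts as a comma-delimited, UTF-8 encoded CSV file."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


out_dir = Path(tempfile.mkdtemp())
write_table(out_dir / "nodes.csv", nodes)
write_table(out_dir / "edges.csv", edges)

# Read the node table back to confirm the special characters survived.
with open(out_dir / "nodes.csv", encoding="utf-8", newline="") as handle:
    round_trip = list(csv.DictReader(handle))
```

Note that everything read back from a CSV is text, so the ID 1 comes back as the string "1". This matters later when the same IDs have to match between the node and edge tables.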
2. Import Network - the Edge Table
When you open Cytoscape, begin by saving your session in the File menu.
- First import your edge table with the Import Network from File button, or with the menu item of the same name under File - Import.
- If you are importing from an Excel spreadsheet, you will need to indicate which sheet you want to pull the data from.
- If you are importing a CSV or tab-delimited list, you merely select the appropriate file for the edge table. If your file is tab-delimited you need to indicate this in the advanced options.
You will be presented with a dialog box where you can indicate which column contains the source nodes (a green circle) and which contains the target nodes (a red target symbol). Other elements can be left as edge attributes. Press OK.
Your network should now be visible but still has nothing in the node table in the table panel below, except for the names.
Note: In Cytoscape you also have the option of importing all your data through a single file (an edge table which includes all the attributes of nodes as well, in every row) or, as above, in two steps with an edge table and a node table. I suggest keeping your edge table (of relations) and node table (describing your nodes) separate as good database practice.
3. Import Table - the Node Table
- Now select the table icon button Import Table from File, also found as a sub-menu in File - Import. Choose your node table. The same applies as above regarding Excel spreadsheet or CSV import.
- You will be presented with a dialog box with some options
- The “where to import table data” is likely to be “to a network collection” assuming you have only one network in Cytoscape
- Importantly, the “Import Data as:” should say “Node Table Columns.”
- In the preview of the columns below, the one important thing is to select the key column: the column containing the node identifiers that correspond to the source and target node columns in your edge table. Good data practice is to use a unique ID for each node.
- After you import your node table, the node attributes you imported should appear in the “Node Table” tab of the table panel in the bottom right.
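Since the import depends on the key column lining up with the source and target columns of the edge table, it can be worth checking this before you open Cytoscape. The following Python sketch shows one way to do so. It is not a Cytoscape feature, and the column names (id, source, target) and sample rows are assumptions for the example.

```python
from collections import Counter

# Hypothetical tables as they might look after reading two CSV files.
node_rows = [
    {"id": "1", "name": "Agent A"},
    {"id": "2", "name": "Agent B"},
    {"id": "3", "name": "Agent C"},
    {"id": "3", "name": "Agent C (duplicate entry)"},
]
edge_rows = [
    {"source": "1", "target": "2"},
    {"source": "2", "target": "4"},
]


def check_keys(node_rows, edge_rows, key="id", source="source", target="target"):
    """Report node IDs that are duplicated, missing, or never used in an edge."""
    counts = Counter(row[key] for row in node_rows)
    known = set(counts)
    referenced = {row[source] for row in edge_rows} | {row[target] for row in edge_rows}
    return {
        "duplicate_ids": sorted(i for i, n in counts.items() if n > 1),
        "missing_from_node_table": sorted(referenced - known),
        "not_in_any_edge": sorted(known - referenced),
    }


report = check_keys(node_rows, edge_rows)
```

In this example the report flags ID 3 as duplicated, ID 4 as used in an edge but absent from the node table, and ID 3 as appearing in no edge at all. A node that appears in no edge is not necessarily an error, but it will not be in a network created from the edge table alone.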
4. A Quick Run Through
- I’ll demonstrate a common workflow of getting data in, styling the network in various ways and exporting the output.
- I’ll note that you don’t have to import tables. It is possible to create the network from scratch directly inside of Cytoscape by creating a new Empty network, right-clicking on the workspace and adding nodes directly onto the canvas.
- I have produced a YouTube screencast which reproduces this quick run through: Simple Network Visualisation for Historians using Cytoscape
5. The Workspace: Note the network graph on the right in the workspace.
- You can easily search for a node or even edges through the search box at the top right. This will search not only the visible label information but also other information in the attributes of edges and nodes. The items that have been ‘found’ will be selected and can be moved or manipulated as a unit. This doesn’t work well for numerical searches. Consider also the Filter features in the control panel for more complex filtering.
- You select a node or edge by clicking it, or finding it through search. Whether only nodes, only edges, or both edges and nodes can be selected will depend on what icons you have turned on in the bottom right of the workspace, or in the Mouse Drag Selects menu in the Select menu.
- You can select a range of multiple nodes and edges by holding down the command key (on a Mac) and then dragging across a group of nodes or edges.
- You can select multiple nodes and edges by holding down the shift or the command key and then clicking on multiple elements.
- You can “click off” in the blank space of the graph to deselect things, or choose Deselect all Nodes and Edges from the Select menu.
- You can grab the neighbours of a selection with the icon showing the two houses, or in the Select menu under Nodes and First Neighbors of Selected Nodes (there you can specify whether to treat the network as undirected or directed for the selection).
- You can take any group of selected nodes and create a new network from them with the double document icon button or New Network from Selected Nodes in the File menu. You can specify if you want to include all edges between selected nodes or only those you have explicitly selected.
- You can “hide” nodes that are selected with the eye icon or Hide Selected Nodes or Edges in the Select menu.
- You can manually build onto the network directly here with right-clicks:
- For example, you can add an empty node by right-clicking in an empty space and choosing: Add - Node
- You can add edges between two or more nodes by first selecting them, then right-clicking on them and choosing Add - Edges Connecting Selected Nodes
- Right-clicking on the workspace gives you a full range of options and commands. The range of options presented to you differ, depending on whether you click on an empty area of the workspace or directly on a node, and on whether there are any nodes selected.
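To make the first-neighbour selection mentioned above more concrete, here is a rough Python sketch of the logic behind it. It is an illustration of the idea, not Cytoscape's own code. The edge list and the mode names ("undirected", "outgoing", "incoming") are my own labels for the example.

```python
# Hypothetical directed edge list of (source, target) pairs.
edges = [("a", "b"), ("b", "c"), ("d", "a"), ("c", "e")]


def first_neighbours(edges, selected, mode="undirected"):
    """Return the selection plus every node one step away from it.

    mode "undirected" ignores edge direction, "outgoing" follows edges
    from source to target only, and "incoming" follows them backwards.
    """
    selected = set(selected)
    result = set(selected)
    for source, target in edges:
        if source in selected and mode in ("undirected", "outgoing"):
            result.add(target)
        if target in selected and mode in ("undirected", "incoming"):
            result.add(source)
    return result


one_step = first_neighbours(edges, {"a"})                   # a, b and d
outgoing_only = first_neighbours(edges, {"a"}, "outgoing")  # a and b
two_steps = first_neighbours(edges, one_step)               # adds c
```

Applying the function again to its own result widens the selection by one more step each time, which is the same effect as repeating the menu command.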
6. The Control Panel: Note the available views at the left in the control panel:
- As is true for the workspace, right-clicking (control-clicking) does a lot in Cytoscape
- Network:
- Here you can “destroy” a network with a right-click on a network or sub-network and Destroy Network (any sub-networks will not be destroyed unless you destroy the whole “collection”).
- The number of nodes and edges will be seen next to the network name.
- Any “sub-networks” you create from filters will be listed hierarchically under here.
- If you have a saved style you can apply it here with a right-click and Apply Style
- Style is where you visually style the network.
- Note that there are three tabs for styling: Node, Edge, and the rarely useful Network.
- You can export a style to a file for later re-use (see Export in the File menu).
- If you right click on a definition style for any characteristic of a node or edge, you can reset it to the default.
- If you want to remove a style mapping to a particular column, you can right click further to the right where you see the description of the feature and remove mapping.
- Filter is where you can build a filter to select nodes or edges which match certain criteria
- Annotation is for simple labels and other graphics, but I suggest doing any of this in another vector graphics application after exporting your network as an image.
7. The Table Panel: Note the table data at the bottom right in the table panel.
- This will show your node table, the edge table, and a usually not useful network table
- With the node table showing, when you click on a node or shift-click on several nodes, it will show the data for only the selected nodes. The same goes for edges. Note that you must have the relevant table showing to see this change.
- Your node table may have an extra column with “shared” in it, which duplicates the key column. Similarly, the edge table has some extra columns not in your original table, including an interaction column, which by default is “interacts with”, and a unique name that is a combination of the key values of the origin and target nodes of an edge.
- You can edit values directly. More advanced features, such as adding columns or using equations to derive values, are also present in Cytoscape, but they are clunky here and I recommend you do these things in Excel or Apple Numbers instead.
- File
- New Network - If you have selected some nodes or nodes and edges, you can create a new sub-network from the selection.
- Import - You can access the import network and import table features from here.
- Export - You can export your visualised network from here as an image. You can also export the network to a file, or export your styling to a file for later re-use.
- Select
- Mouse Drag Selects - Here you can toggle the ability to select only nodes, edges, or both
- Edges - Perform actions such as hide and show on the selected edges
- Show all Nodes and Edges - Useful when you have hidden some edges from view.
- Hide Selected Nodes and Edges - Will hide selected nodes and edges.
- Nodes - Perform actions on selected nodes. The First Neighbors of Selected Nodes feature will ‘walk’ a path selecting nodes in all possible directions by increments of one distance.
- Layout
- This menu offers you access to a large variety of layout algorithms which will change the distribution of your nodes and edges in the workspace. Apply Preferred Layout is a way to go back to a default position (or make use of the Undo feature in the Edit menu!). Unless you change the default in the Settings, your default layout algorithm will be the Prefuse Force Directed Layout.
- Layout Tools - This will make a new tools panel appear in the bottom left where you can customise some of the layout elements, scale things with width, height, or both fixed, and rotate your graph.
- Clear All Edge Bends - If a layout you applied bends the edges to make them look curved, and you want to remove this, use this menu command.
- Tools
- Analyze Network - This is the beginning of any more formal analysis. It will calculate a number of common network measures for your network and add them as columns.
- Apps
- Cytoscape has a family of plug-ins and other extra features that can be installed here and which, once installed, may add additional menu options here.
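As a rough illustration of what it means for Analyze Network (under Tools above) to calculate measures and add them as columns, here is a Python sketch that derives the simplest of them, degree, from an edge list. Cytoscape calculates a good many more measures than this. The edge list and the column names (in_degree, out_degree, degree) are assumptions for the example and are not the names Cytoscape uses.

```python
from collections import Counter

# Hypothetical directed edge list of (source, target) pairs.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "a")]


def degree_columns(edges):
    """Build one row per node with in-degree, out-degree and total degree."""
    out_degree = Counter(source for source, _ in edges)
    in_degree = Counter(target for _, target in edges)
    node_ids = sorted(set(out_degree) | set(in_degree))
    return [
        {
            "id": node,
            "in_degree": in_degree[node],
            "out_degree": out_degree[node],
            "degree": in_degree[node] + out_degree[node],
        }
        for node in node_ids
    ]


measures = degree_columns(edges)
```

Each row in the result corresponds to one node, so the measures can sit alongside the other node attributes and be used for styling, for example to size nodes by degree.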
Konrad M. Lawson. Creative Commons - Attribution CC BY, 2020.