Introduction

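This notebook shows how to locate transcription factors (TFs) in a pypath network, first through Gene Ontology (GO) annotations and then through pypath's built-in TF and receptor lists, and compares the two selections.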

Analysis

In [1]:
# Show all the plots inside the notebook
%matplotlib inline
In [2]:
# load packages
import pypath
import igraph  # import igraph to use the plot function

import numpy as np
import pandas as pd
import seaborn as sns
In [3]:
pa = pypath.PyPath()

	=== d i s c l a i m e r ===

	All data coming with this module
	either as redistributed copy or downloaded using the
	programmatic interfaces included in the present module
	are available under public domain, are free to use at
	least for academic research or education purposes.
	Please be aware of the licences of all the datasets
	you use in your analysis, and please give appropriate
	credits for the original sources when you publish your
	results. To find out more about data sources please
	look at `pypath.descriptions` and
	`pypath.data_formats.urls`.

	» New session started,
	session ID: '8a5wl'
	logfile:'./log/8a5wl.log'.
In [4]:
pa.init_network()
	:: Loading  from cache, previously downloaded from www.uniprot.org
	:: Loading  from cache, previously downloaded from www.uniprot.org
	:: Loading HUMAN_9606_idmapping.dat.gz from cache, previously downloaded from ftp.uniprot.org
	:: Processing ID conversion list: finished, 100.0%
 » SignaLink3
	:: Reading from cache: cache/signalink3.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `netbiol_effect` has multiple types of values: str, list
WARNING:pypath.logn:### WARNING ###### Edge attribute `netbiol_mechanism` has multiple types of values: str, list
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `netbiol_is_direct` has multiple types of values: str, list
 » NetPath
	:: Reading from cache: cache/netpath.edges.pickle
	:: Loading 'genesymbol' to 'uniprot' mapping table
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 » DOMINO
	:: Reading from cache: cache/domino.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 » AlzPathway
	:: Reading from cache: cache/alzpathway.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `label` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 » CancerCellMap
	:: Reading from cache: cache/cancercellmap.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 » ARN
	:: Reading from cache: cache/arn.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `label` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `netbiol_effect` has multiple types of values: unicode, list
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 » DeathDomain
	:: Reading from cache: cache/deathdomain.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 » ELM
	:: Reading from cache: cache/elm.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 » CA1
	:: Reading from cache: cache/ca1.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `label` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `ca1_type` has multiple types of values: unicode, list
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `ca1_effect` has multiple types of values: unicode, list
 » DEPOD
	:: Reading from cache: cache/depod.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `label` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 » TRIP
	:: Reading from cache: cache/trip.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `label` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 » HPRD
	:: Reading from cache: cache/hprd.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `label` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `hprd_mechanism` has multiple types of values: str, list
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 » SPIKE
	:: Reading from cache: cache/spike.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `label` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `spike_effect` has multiple types of values: unicode, list
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `spike_mechanism` has multiple types of values: unicode, list
 » LMPID
	:: Reading from cache: cache/lmpid.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `label` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 » dbPTM
	:: Reading from cache: cache/dbptm.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `label` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 » phosphoELM
	:: Reading from cache: cache/phosphoelm.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `label` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 » MatrixDB
	:: Reading from cache: cache/matrixdb.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `label` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 » InnateDB
	:: Reading from cache: cache/innatedb.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `label` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 » MPPI
	:: Reading from cache: cache/mppi.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `label` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 » NRF2ome
	:: Reading from cache: cache/nrf2ome.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `label` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `netbiol_effect` has multiple types of values: unicode, list
 » Signor
	:: Reading from cache: cache/signor.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `label` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 » Macrophage
	:: Reading from cache: cache/macrophage.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `label` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 » PDZBase
	:: Reading from cache: cache/pdzbase.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `label` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 » PhosphoSite
	:: Reading from cache: cache/phosphosite.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `label` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 » BioGRID
	:: Reading from cache: cache/biogrid.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `label` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 » Guide2Pharma
	:: Reading from cache: cache/guide2pharma.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `label` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 » DIP
	:: Reading from cache: cache/dip.edges.pickle
	:: Processing nodes: finished, 100.0%
	:: Processing edges: finished, 100.0%
	:: Processing attributes: finished, 100.0%
WARNING:pypath.logn:### WARNING ###### Vertex attribute `name` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `label` has multiple types of values: str, unicode
WARNING:pypath.logn:### WARNING ###### Vertex attribute `exp` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative` has only None values
WARNING:pypath.logn:### WARNING ###### Edge attribute `negative_refs` has only None values
 :: Comparing with reference lists... done.

 » 29949 interactions between 7476 nodes
 from 27 resources have been loaded,
 for details see the log: ./log/8a5wl.log

We will use GO annotations to locate TFs.

In [5]:
# load go annotations:
pa.load_go()
	:: Loading gene_association.goa_human.gz from cache, previously downloaded from ftp.ebi.ac.uk
	:: Loading GO annotations: finished, 100.0%
In [6]:
# get the GO annotation:
pa.go_dict()
	:: Loading GAnnotation from cache, previously downloaded from www.ebi.ac.uk
In [7]:
# get also the directed network
pa.get_directed()
#pa.ugraph = pa.graph
#pa.graph = pa.dgraph
	:: Setting directions: finished, 100.0%
In [8]:
# list names instead of IDs:
# (9606 is an NCBI taxonomy ID)
map(pa.go[9606].get_name, set(pa.gs('GATA1')['go']['C']))
Out[8]:
['nucleoplasm',
 'nucleus',
 'transcriptional repressor complex',
 'transcription factor complex']
Some GO terms that may be useful (C = cellular component, P = biological process):

(C) transcription factor complex
(C) transcriptional repressor complex
(P) cell surface receptor signaling pathway
( ) plasma membrane receptor complex
(C) plasma membrane
(C) cell surface
In [9]:
tf = pa.dgraph.vs.select(lambda vertex: pa.go[9606].get_term('transcription factor complex') in vertex['go']['C'])
tfr = pa.dgraph.vs.select(lambda vertex: pa.go[9606].get_term('transcriptional repressor complex') in vertex['go']['C'])
print('Number of nodes annotated as \'transcription factor complex\': {}'.format(len(tf)))
print('Number of nodes annotated as \'transcriptional repressor complex\': {}'.format(len(tfr)))
# Note: some nodes may be annotated with both GO terms
print('Number of nodes annotated with any of the two terms above: {}'.format(len(set(tf['label']+tfr['label']))))
Number of nodes annotated as 'transcription factor complex': 124
Number of nodes annotated as 'transcriptional repressor complex': 37
Number of nodes annotated with any of the two terms above: 155
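
The three counts above are linked by inclusion-exclusion: if labels are unique within each selection, 124 + 37 - 155 = 6 nodes carry both GO terms. A minimal sketch of that bookkeeping, using made-up label lists (the gene names are illustrative placeholders, not results from the network):

```python
# Toy label lists standing in for tf['label'] and tfr['label'];
# the names are hypothetical examples, not pypath output.
tf_labels = ['GATA1', 'TP53', 'JUN', 'FOS']
tfr_labels = ['GATA1', 'HDAC1']

either = set(tf_labels) | set(tfr_labels)   # annotated with at least one term
both = set(tf_labels) & set(tfr_labels)     # annotated with both terms

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
assert len(either) == len(set(tf_labels)) + len(set(tfr_labels)) - len(both)
print(len(either), len(both))
```

Computing the intersection explicitly, rather than inferring it from the counts, also shows which nodes are shared.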

We can also select nodes annotated with several GO terms at once. For example, we can locate all nodes annotated with both 'cell surface' and 'plasma membrane', that is, plasma membrane proteins exposed on the cell surface.

In [10]:
filter_func = lambda vertex: pa.go[9606].get_term('cell surface') in vertex['go']['C'] and pa.go[9606].get_term('plasma membrane') in vertex['go']['C']
pm = pa.dgraph.vs.select(filter_func)
print('Number of nodes annotated with \'cell surface\' and \'plasma membrane\': {}'.format(len(pm['label'])))
Number of nodes annotated with 'cell surface' and 'plasma membrane': 215
map(pa.go[9606].get_name, set(pm[0]['go']['F']))
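
When more than two terms are combined, chaining `and` clauses in a lambda gets unwieldy. One possible alternative is a small helper that tests a whole list of required terms at once. The sketch below is only an illustration: `has_all_terms` is a hypothetical helper, the vertices are plain dictionaries with term names, and the annotations are invented, whereas in the notebook the terms are GO identifiers returned by `get_term`.

```python
def has_all_terms(annotations, required, aspect='C'):
    """Return True if every term in `required` is annotated under `aspect`."""
    return set(required).issubset(annotations.get(aspect, ()))

# Hypothetical vertices: each maps a GO aspect (C, F, P) to a set of term names.
vertices = {
    'EGFR': {'C': {'cell surface', 'plasma membrane'}},
    'GATA1': {'C': {'nucleus'}},
    'CD4': {'C': {'plasma membrane'}},
}

wanted = ['cell surface', 'plasma membrane']
selected = sorted(name for name, go in vertices.items()
                  if has_all_terms(go, wanted))
print(selected)
```

A vertex with no annotation for the requested aspect is simply not selected, because `annotations.get` falls back to an empty tuple.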

Locate nodes with no inputs or no outputs. Also, check that there are no isolated nodes.

In [11]:
only_in = pa.dgraph.vs.select(lambda vertex: vertex.outdegree()==0)
only_out = pa.dgraph.vs.select(lambda vertex: vertex.indegree()==0)
isolated = pa.graph.vs.select(lambda vertex: vertex.degree()==0)
print('Number of nodes with no output arcs: {}'.format(len(only_in)))
print('Number of nodes with no input arcs: {}'.format(len(only_out)))
print('Number of nodes with no arcs: {}'.format(len(isolated)))
Number of nodes with no output arcs: 2204
Number of nodes with no input arcs: 940
Number of nodes with no arcs: 0
map(pa.go[9606].get_name, set([i for sublist in only_in for i in sublist['go']['C']]))
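
The same classification can be reproduced from a plain edge list, which makes the definitions explicit. This is a sketch on an invented toy graph, not the pypath network. Unlike the cell above, it keeps isolated nodes out of the "no output" and "no input" groups; in the notebook the distinction does not matter because no isolated nodes were found.

```python
from collections import Counter

# Hypothetical directed edges (source, target); not taken from the network.
edges = [('A', 'B'), ('B', 'C'), ('A', 'C'), ('D', 'B')]
nodes = {'A', 'B', 'C', 'D', 'E'}  # 'E' has no edges at all

outdeg = Counter(src for src, _ in edges)  # number of outgoing arcs per node
indeg = Counter(dst for _, dst in edges)   # number of incoming arcs per node

# Counter returns 0 for missing keys, so nodes without arcs need no special case.
only_in = sorted(n for n in nodes if outdeg[n] == 0 and indeg[n] > 0)    # no outputs
only_out = sorted(n for n in nodes if indeg[n] == 0 and outdeg[n] > 0)   # no inputs
isolated = sorted(n for n in nodes if indeg[n] == 0 and outdeg[n] == 0)  # no arcs
print(only_in, only_out, isolated)
```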

Paths between surface proteins and TFs

In [12]:
dnode_list = set()
rows = pm['label']
cols = tf['label'] + tfr['label']
ddistance = pd.DataFrame(np.nan, index=rows, columns=cols)
for igene1 in rows:
    for igene2 in cols:
        path = pa.dgraph.get_shortest_paths(pa.dgenesymbol(igene1)['name'], to=pa.dgenesymbol(igene2)['name'])[0]
        dnode_list.update(path)
        ddistance.loc[igene1, igene2] = len(path)-1 if len(path)>0 else np.nan
/Applications/anaconda/envs/py27/lib/python2.7/site-packages/ipykernel/__main__.py:7: RuntimeWarning: Couldn't reach some vertices at structural_properties.c:740
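
The loop above relies on two conventions: a path is a list of vertices, so its length in edges is `len(path) - 1`, and an unreachable target yields an empty path (the source of the RuntimeWarning), which is stored as NaN. A minimal breadth-first sketch of the same logic on an invented adjacency list; `shortest_path` and `distance` are hypothetical helpers, not pypath or igraph functions:

```python
from collections import deque

def shortest_path(adj, source, target):
    """Breadth-first search; return the nodes of one shortest path, or []."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the source, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in adj.get(node, ()):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return []  # target not reachable from source

def distance(adj, source, target):
    """Number of edges on a shortest path, or NaN if there is none."""
    path = shortest_path(adj, source, target)
    return len(path) - 1 if path else float('nan')

# Hypothetical directed graph: receptor R signals to TF through two kinases.
adj = {'R': ['K1'], 'K1': ['K2'], 'K2': ['TF'], 'X': ['R']}
print(distance(adj, 'R', 'TF'), distance(adj, 'TF', 'R'), distance(adj, 'R', 'R'))
```

A node paired with itself gives a one-vertex path and therefore a distance of 0, which is how a protein present in both the surface and the TF selections appears in the distance table.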
In [13]:
interconnection_dgraph = pa.dgraph.induced_subgraph(dnode_list)
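
Note that an induced subgraph keeps every edge whose two endpoints were collected, not only the edges that lie on the shortest paths. A small sketch of that definition on an invented edge list (`induced_edges` is a hypothetical helper written for this illustration):

```python
def induced_edges(edges, keep):
    """Keep only the edges whose two endpoints are both in `keep`."""
    keep = set(keep)
    return [(u, v) for u, v in edges if u in keep and v in keep]

# Hypothetical edges; the shortest path from R to TF is R -> K1 -> TF.
edges = [('R', 'K1'), ('K1', 'TF'), ('TF', 'K1'), ('K1', 'Z')]
sub = induced_edges(edges, ['R', 'K1', 'TF'])
print(sub)
```

The feedback edge from TF to K1 is retained although no shortest path uses it, while the edge to Z is dropped because Z was not collected. This is one reason the degree distribution of the subgraph can be broader than the paths alone would suggest.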
In [14]:
# for directed graphs with many edges, plotting the network may be prohibitively slow
#igraph.plot(interconnection_dgraph, layout=interconnection_dgraph.layout_auto(), vertex_label=None)
In [15]:
sns.plt.hist(interconnection_dgraph.degree(), bins=100)
Out[15]:
(array([ 76.,  51.,  87.,  42.,  34.,  61.,  22.,  42.,  19.,  16.,  30.,
          9.,  17.,  27.,   8.,  18.,   6.,  10.,  11.,   5.,   6.,   9.,
          5.,   7.,   4.,   2.,   6.,   3.,   4.,   5.,   2.,   3.,   2.,
          4.,   3.,   2.,   1.,   1.,   1.,   2.,   0.,   0.,   1.,   0.,
          1.,   0.,   2.,   2.,   1.,   1.,   2.,   0.,   2.,   1.,   0.,
          0.,   2.,   1.,   1.,   0.,   1.,   0.,   1.,   0.,   0.,   1.,
          0.,   1.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   1.,
          1.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,   0.,
          0.,   1.,   0.,   0.,   0.,   0.,   1.,   0.,   0.,   0.,   0.,
          1.]),
 array([   1.  ,    2.38,    3.76,    5.14,    6.52,    7.9 ,    9.28,
          10.66,   12.04,   13.42,   14.8 ,   16.18,   17.56,   18.94,
          20.32,   21.7 ,   23.08,   24.46,   25.84,   27.22,   28.6 ,
          29.98,   31.36,   32.74,   34.12,   35.5 ,   36.88,   38.26,
          39.64,   41.02,   42.4 ,   43.78,   45.16,   46.54,   47.92,
          49.3 ,   50.68,   52.06,   53.44,   54.82,   56.2 ,   57.58,
          58.96,   60.34,   61.72,   63.1 ,   64.48,   65.86,   67.24,
          68.62,   70.  ,   71.38,   72.76,   74.14,   75.52,   76.9 ,
          78.28,   79.66,   81.04,   82.42,   83.8 ,   85.18,   86.56,
          87.94,   89.32,   90.7 ,   92.08,   93.46,   94.84,   96.22,
          97.6 ,   98.98,  100.36,  101.74,  103.12,  104.5 ,  105.88,
         107.26,  108.64,  110.02,  111.4 ,  112.78,  114.16,  115.54,
         116.92,  118.3 ,  119.68,  121.06,  122.44,  123.82,  125.2 ,
         126.58,  127.96,  129.34,  130.72,  132.1 ,  133.48,  134.86,
         136.24,  137.62,  139.  ]),
 <a list of 100 Patch objects>)
In [16]:
sns.plt.plot(ddistance.as_matrix().ravel(), '.')
Out[16]:
[<matplotlib.lines.Line2D at 0x12c7d7f90>]
In [17]:
tmp = ddistance.as_matrix().ravel()
sns.plt.hist(tmp[~np.isnan(tmp)], bins=100)
Out[17]:
(array([  1.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   2.60000000e+01,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   5.80000000e+02,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   3.44500000e+03,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          5.63100000e+03,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   3.62200000e+03,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   1.19700000e+03,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   3.46000000e+02,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          2.88000000e+02,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          1.29000000e+02,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   1.90000000e+01,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   9.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
          2.00000000e+00]),
 array([  0.  ,   0.12,   0.24,   0.36,   0.48,   0.6 ,   0.72,   0.84,
          0.96,   1.08,   1.2 ,   1.32,   1.44,   1.56,   1.68,   1.8 ,
          1.92,   2.04,   2.16,   2.28,   2.4 ,   2.52,   2.64,   2.76,
          2.88,   3.  ,   3.12,   3.24,   3.36,   3.48,   3.6 ,   3.72,
          3.84,   3.96,   4.08,   4.2 ,   4.32,   4.44,   4.56,   4.68,
          4.8 ,   4.92,   5.04,   5.16,   5.28,   5.4 ,   5.52,   5.64,
          5.76,   5.88,   6.  ,   6.12,   6.24,   6.36,   6.48,   6.6 ,
          6.72,   6.84,   6.96,   7.08,   7.2 ,   7.32,   7.44,   7.56,
          7.68,   7.8 ,   7.92,   8.04,   8.16,   8.28,   8.4 ,   8.52,
          8.64,   8.76,   8.88,   9.  ,   9.12,   9.24,   9.36,   9.48,
          9.6 ,   9.72,   9.84,   9.96,  10.08,  10.2 ,  10.32,  10.44,
         10.56,  10.68,  10.8 ,  10.92,  11.04,  11.16,  11.28,  11.4 ,
         11.52,  11.64,  11.76,  11.88,  12.  ]),
 <a list of 100 Patch objects>)
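
Because path lengths are integers, only 13 of the 100 bins in Out[17] are non-empty, and each of them corresponds to exactly one distance (the bin width, 0.12, is below 1). The sketch below copies those 13 counts from the output above and summarizes them; the summary statistics are derived here and are not part of the original notebook.

```python
# Path-length counts read off the non-empty bins of Out[17].
counts = {0: 1, 1: 26, 2: 580, 3: 3445, 4: 5631, 5: 3622, 6: 1197,
          7: 346, 8: 288, 9: 129, 10: 19, 11: 9, 12: 2}

total = sum(counts.values())          # number of connected (source, target) pairs
mode = max(counts, key=counts.get)    # most frequent path length
mean = sum(d * c for d, c in counts.items()) / float(total)

print(total, mode, round(mean, 2))
```

Plotting with one bin per integer, or tabulating the values directly, would give the same information without the empty bins.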

Using pypath to retrieve TFs and receptors

In [18]:
pa.set_transcription_factors()
pa_tf = pa.transcription_factors()
	:: Loading nrg2538-s3.txt from cache, previously downloaded from www.nature.com
	:: Loading HUMAN_9606_idmapping.dat.gz from cache, previously downloaded from ftp.uniprot.org
	:: Processing ID conversion list: finished, 100.0%
	:: Loading 'uniprot-sec' to 'uniprot-pri' mapping table
	:: Loading 'genesymbol' to 'trembl' mapping table
	:: Loading 'genesymbol' to 'swissprot' mapping table
	:: Loading 'genesymbol-syn' to 'swissprot' mapping table
	:: Loading 'hgnc' to 'uniprot' mapping table
In [19]:
pa_tf = pa.graph.vs.select(lambda vertex: vertex['tf'] is True)
len(pa_tf)
Out[19]:
530
with pypath.dataio.cache_off(): pa.set_receptors()
In [31]:
pa.set_receptors()
	:: Loading findGenes.asp from cache, previously downloaded from receptome.stanford.edu
	:: Loading 'genesymbol-syn' to 'uniprot' mapping table
In [32]:
pa_rec = pa.graph.vs.select(lambda vertex: vertex['rec'] is True)
len(pa_rec)
Out[32]:
966
pypath.dataio.get_hpmr()
In [34]:
pypath.data_formats.urls['hpmr']['url']
Out[34]:
'http://receptome.stanford.edu/hpmr/SearchDB/findGenes.asp?textName=*'
html = pypath.dataio.curl(pypath.data_formats.urls['hpmr']['url'], silent = False)
pa.graph.es[0]
pa.graph.es[0]['sources_by_type']
pa.graph.es[0]['type']
In [42]:
tf_by_go = list(tf['label'] + tfr['label'])
In [43]:
tf_by_pa = list(pa_tf['label'])
In [44]:
tf_shared = set(tf_by_pa).intersection(tf_by_go)
len(tf_shared)
Out[44]:
93
In [47]:
rec_by_go = list(pm['label'])
rec_by_pa = list(pa_rec['label'])
rec_shared = set(rec_by_pa).intersection(rec_by_go)
len(rec_shared)
Out[47]:
134
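
The shared counts are easier to interpret relative to the sizes of the lists being compared: 93 shared TFs against 155 GO-based and 530 pypath TFs, and 134 shared receptors against 215 GO-based and 966 pypath receptors. A sketch of a helper that reports these ratios, shown on invented labels; `overlap_summary` is a hypothetical function, and it assumes labels are unique within each list:

```python
def overlap_summary(a, b):
    """Return (shared, fraction of a in b, fraction of b in a, Jaccard index)."""
    a, b = set(a), set(b)
    shared = len(a & b)
    union = len(a | b)
    return (shared,
            shared / float(len(a)) if a else 0.0,
            shared / float(len(b)) if b else 0.0,
            shared / float(union) if union else 0.0)

# Hypothetical label lists standing in for rec_by_go and rec_by_pa.
go_based = ['EGFR', 'CD4', 'ITGB1', 'GATA1']
curated = ['EGFR', 'CD4', 'INSR']
result = overlap_summary(go_based, curated)
print(result)
```

In the notebook the call would be `overlap_summary(tf_by_go, tf_by_pa)` or `overlap_summary(rec_by_go, rec_by_pa)`.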