Java Web Start links
--------------------

workshop page (has all the links you will need today):
  http://gaggle.systemsbiology.org/pshannon/bioc2007/

boss:
  http://gaggle.systemsbiology.org/2005-11/boss/boss.jnlp

name translator:
  http://gaggle.systemsbiology.org/nameTranslations/humanStringToGeneID.70.jnlp

DMV:
  http://gaggle.systemsbiology.net/2005-11/dmv.jnlp

Cytoscape 1.2, for Homo sapiens, Entrez Gene IDs:
  http://gaggle.systemsbiology.org/pshannon/cy12/blankSlate/human/cy.jnlp


R commands
----------

> library (gaggle)
> gaggleInit ()
> geese ()
> setSpecies ('Homo sapiens')

# read the expression matrix
> m = read.table ('matrix.tsv', sep='\t')

# which rows have highly variable expression?
> rows = which (apply (m, 1, function (row) IQR (row) > 5))

# what are the Entrez Gene IDs for these rows?
> genes = row.names (m) [rows]

# make sure that the Firegoose is registered with the boss
# set the 'target goose': the program which will receive the broadcast
# of these five genes.
# first: which geese are currently running and known to the boss?
> geese ()

# specify the Firefox browser with the gaggle extension, aka 'Firegoose'
> setTargetGoose ('Firegoose')

# define a convenience function to save some typing on subsequent assignment
# of the target goose
> stg = function (g) setTargetGoose (geese () [g])

# now broadcast the genes
> broadcast (genes)

# we will now make a number of manipulations in the browser, eventually resulting
# in a broadcast of an expanded list of genes back to R
> from.string = getNameList ()

# take a look at the identifiers
> from.string

# in your browser, return to the workshop web page, and launch the nameTranslator
# goose. rebroadcast the identifiers from STRING, through the nameTranslator, and
# thence to R
> from.string = getNameList ()

# take a look at the identifiers. they should now (mostly) be Entrez Gene IDs
> from.string

# how many of them are there?
> length (from.string)

# how many are in the expression matrix?
> length (intersect (from.string, rownames (m)))   # suggested by STRING, in our data
[1] 13

# what are the new genes, found by STRING but not selected by our original
# stringent non-specific filter?
> new.genes = setdiff (intersect (from.string, rownames (m)), genes)

# find out the variability of these new genes
> sapply (new.genes, function (row) IQR (m [row, ]))
     3108      3112       912      7078      3109      4846       207       960       972
0.6989106 0.2881418 0.6021836 0.1142142 3.8377716 0.9327259 0.3628950 0.0000000 4.1485932
[1] 24

# now go back to the STRING network in your browser, and broadcast the network to R,
# being sure to broadcast through the nameTranslator
# expect the message 'network ready, node count 24, edges: 60'
> network = getNetwork ()
# ignore warning messages

# get a summary of the network
> network
A graphNEL graph with undirected edges
Number of Nodes = 24
Number of Edges = 60

# see that a url for the evidence for the associations came back with the broadcast:
> edgeData (network)[[1]][[2]]
[1] "http://string.embl.de/newstring_cgi/show_edge_data.pl?taskId=y2PwtphnXwvx&node1=322867&node2=322518"
> browseURL (edgeData (network)[[1]][[2]])

# next exercise: launch DMV and Cytoscape, play an expression movie
# note: the geese running on your computer will probably be named slightly differently
# than those shown below: version numbers ('-01', '-05') will almost certainly be
# different. keep this in mind, and set target geese appropriately
> geese ()
[1] "Human-01"                   "DMV-01"
[3] "Network"                    "Firegoose"
[5] "STRING-GeneID, Human, v7.0" "R-05"

> setTargetGoose ('DMV-01')    # or stg (2)
> stg (2); showGoose ()        # bring DMV to the front
> broadcast (m, 'ratios')

> setTargetGoose ('Human-01')  # or stg (1)
> broadcast (network)
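
To try the filtering and set operations from this session without a running boss or any geese, the following is a minimal sketch on a made-up matrix. Everything in it is an assumption for illustration only: the gene names 'g1' to 'g4' and 'g9', the values in m.toy, and from.string.toy, which stands in for the name list that getNameList () would return. Only base R functions (apply, IQR, intersect, setdiff, sapply) are used.

```r
# toy expression matrix: rows are hypothetical gene IDs, columns are conditions.
# all values are invented for illustration.
m.toy = matrix (c ( 0,  2,  9, 14, 20,     # g1: widely spread
                    1,  1,  2,  2,  1,     # g2: nearly flat
                   -8, -3,  0,  4, 10,     # g3: widely spread
                    3,  3,  3,  3,  3),    # g4: constant
                nrow=4, byrow=TRUE,
                dimnames=list (c ('g1', 'g2', 'g3', 'g4'),
                               paste ('cond', 1:5, sep='')))

# same non-specific filter as in the session: keep rows whose IQR exceeds 5
rows.toy = which (apply (m.toy, 1, function (row) IQR (row) > 5))
genes.toy = rownames (m.toy) [rows.toy]

# stand-in for the identifiers that would come back via getNameList ();
# 'g9' is deliberately absent from the matrix
from.string.toy = c ('g1', 'g2', 'g4', 'g9')

# names that are both in the returned list and in our data
in.data.toy = intersect (from.string.toy, rownames (m.toy))

# names in our data that the original filter did not select
new.genes.toy = setdiff (in.data.toy, genes.toy)

# variability of the newly suggested genes
iqr.new.toy = sapply (new.genes.toy, function (row) IQR (m.toy [row, ]))

print (genes.toy)
print (new.genes.toy)
print (iqr.new.toy)
```

In this toy case the filter keeps g1 and g3, while g2 and g4 are the 'new' genes: present in the data, but with an IQR well below the cutoff. This mirrors the pattern in the session, where most of the genes suggested by STRING fall under the IQR > 5 threshold.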