Thunderbolts Forum


Plasmatic
Re: Online scientific discourse is broken and it can be fixe

CC said:
And if we're going to keep debating the issues at the highest of philosophical levels, I guarantee you that this will never lead anywhere!
I think both Chris and I would reject this premise. That being said, I agree that actual use of, and discussion about the tools that already exist is essential as well.

CharlesChandler
Re: Online scientific discourse is broken and it can be fixe

Plasmatic wrote:
And if we're going to keep debating the issues at the highest of philosophical levels, I guarantee you that this will never lead anywhere!
I think both Chris and I would reject this premise. That being said, I agree that actual use of, and discussion about the tools that already exist is essential as well.
If the online discussion is about how great ideas get lost in online discussions, why wouldn't the great ideas in this discussion fall prey to the subject of the discussion??? :) In order to get some traction here, we have to start structuring our thoughts, so that we can see the logic laid out, and start building on it. That's all I'm saying. As you know, I consider myself first and foremost to be a philosopher, and I'm not scared of debating the metaphysical and epistemic substrate of these issues. But I'll only go about it in a structured way. I don't have time for open-ended conversations — I want to see an article produced, or at least an outline, and for the next round to build on what has gone before. ;)

Lloyd
Re: Online scientific discourse is broken and it can be fixe

Chris and Plas
Chris, I was thinking you're Zane, because I keep getting your and his usernames mixed up. Plas, do you care to mention your name? It'll come back to me when you tell me, I think, because I believe I interviewed you in 2007 on this forum before it crashed. Didn't I?

It's nice to be having a discussion with all of you. Chris, I know you were talking along these lines a couple years or so ago when I was on that other forum with you all. I was interested in the subject then too, but the mods there didn't allow me to do much discussing.

Charles' Outline - Charles, I just checked out your outline at http://qdl.scs-inc.us/?top=5960. It looks like you're trying to outline some of the main points of this discussion. Here's the topic outline:

Strategy: employ an improved scientific method.
-- The scientific community is dominated by conformists.
-- There are a LOT of hypotheses out there.
-- Solutions
---- Develop a system with user-definable permissions.
---- Use a combination of threads and outlines.
---- Allow collapsible descriptions of outline items.
---- Articles can be written to flesh out the outlines.
---- Cross-link just the proven hypotheses into the Scientific Knowledge folder.
---- Software has to support user rankings of hypotheses.
---- A non-linear hierarchical system reduces redundancies across competing world views.


Plan
I wrote up a few years ago what I think is a fairly complete version of a proper Scientific Method, which I posted on the other forum and this one and again on your site, Charles. I suppose the improved sci method in this outline is related. Barbara Sher has a planning method for achieving true goals, which she calls Wishcraft. She says failing to plan is planning to fail. I've had fair results with her method and try to use it pretty often. She says it can work for groups as well as individuals, at least for any goals that the members of the group share. After determining goals, she says to plan by working backwards in action steps from the goal in the future to the present.

I assume that we all share the goal of improving science and removing from it tendencies toward error, such as corruption by anyone with ulterior motives. Here's an initial outline for a plan.

Goal: Improve Science and Maximally Reduce All Tendencies toward Error.
4. Public Library of Science Knowledge is established online
3. A wiki system similar to Wikipedia is built to hold all Science Knowledge
2. Efficient Process of sorting Known, Unproven and Disproven theories is established
1. A group of collaborators organizes for the above purpose


Do we all have interest in being collaborators along these or similar lines?
Does this planning outline cover the basic action steps needed to achieve the goal?
Do we need to be moving this discussion to Charles' site asap in order to start developing a more efficient structure for collaboration?
Or is it best to discuss in both locations?

Charles, do you want to have Reply buttons in the outline?

pln2bz
Re: Online scientific discourse is broken and it can be fixe

Over the years, I have attempted to use such technologies in the various intellectual endeavors that consume my free time, but I have never stuck with any of them. I think that my biggest problem with block diagrams is just that there really isn't very much information in view at any one point in time, and the extremely narrow perspective detracts way too much value. Though I think visually, and my professional expertise is in graphics programming, I always lay out ideas in text, and if there is any significant structure to it, I'll do it in outline format. It's just faster that way, and I think that most people would agree. Block diagrams are time-consuming to produce, and only convey a few simple ideas. They're extremely useful for presentations. But nobody seems to use them for capturing and nurturing the initial thoughts, despite how much effort has been put into developing such technologies. My opinion is that if this was the way to go, we'd already be using Visio for everything (on our desktops, or online). There's a reason for why we aren't. Why is that?
Chandler, I am really enjoying your action-oriented interest. We seem to share a passion for the same subject, and I honestly didn't expect to find people who have already been coding this type of project. It's to be expected that we will also inevitably differ in the best way to move forward. What I try to do is to be extremely cautious about the largest, most important decision points. When I am unsure about how to proceed, what I generally do is expose myself to the relevant domains. In this particular case, on the specific topic of visualizing science, knowledge and controversy, there is actually an enormous volume of pre-existing content which I had already intended to discuss on the related subjects of …

  • As mentioned, debate mapping
  • Concept mapping
  • User interaction
  • Knowledge mapping
  • Social network analysis & research on the long tail
  • Graph databases
  • HTML5's Canvas

I've pushed these subjects off to the tail of the discussion because my take is that the site's visualization should be fashioned as a response to the problems we currently observe in how people discuss and think about science. So, I've been following what I perceive to be a necessary order here, given what I have so far, but I'm at the cusp now of transitioning from talking about those problems to now discussing various potential ideas for solutions — visualizations being an essential topic.

One of the things I've been trying to demonstrate with this thread is how the approach deployed will generally determine the result one gets. My own approach is inherently interdisciplinary and cautious. So, on this topic of visualization, what I would recommend is that we intentionally keep our minds open on this question you ask above. I'd be willing to bet that as we talk about visualization, people's opinions about what is possible and preferable could dramatically shift. My goal at this early stage is to take the three years of research that I've already done, and condense it into a short summary that others can use as they wish. If some sort of consensus, by chance, emerges, my hope is that it is 100% organic, and based upon a cautious look at the widest-breadth view of the related domains. I'm intentionally trying to avoid any language here which forces people to think or act in any particular manner on this subject.

Based on what I've seen to date, I have a deep suspicion that this visualization problem you mention above can indeed be solved, and I suspect that the reason that it has eluded people for so long has to do with the approach that people tend to take in attacking the problem. There seems to be a tendency for people to limit the scope of their investigation. My preference is to continue with this thread here on the Thunderbolts forum, and to continue to push myself to explore ideas and domains which are currently unfamiliar to myself. Please don't interpret that as some sort of rejection of your own preferred outline-based approach. If that helps you to make progress on this problem, then you should go with it, try to perfect it and share what you learn. It's conceivable and even probable that there is actually more than one solution to the problem. We'd be wise to simply fork wherever we see fit, but to still try to learn from one another's efforts. The important point, I think, is to not become discouraged when people disagree about how to continue, and to keep the conversations going.

One of the things that I've learned about starting a business, to date, is that most people lack all of the skills required to actually do it. And for each of those people, there will turn out to be specific reasons for why this is so which relate to our own psychology. For instance, I used to work on a tech support line at a semiconductor company, and it caused me so much stress that it ultimately led me to dislike communicating on the phone to this day. So, for me at least, part of the process of solving wicked problems necessarily involves confronting my own personal barriers which I've over time set up which are obstructing my own success. What I would suggest as a possibility is that the reason that people have thus far failed to solve the knowledge visualization problem — as with any huge, "wicked" problem — is that, of those who have dedicated themselves to this huge task (which is actually a small set of people), none of them managed to overcome each of the numerous obstacles which preclude a solution (ideational, financial, social, even computing power, etc). But, have no doubt that when somebody does finally solve it, in retrospect, if the idea is actually a good one, people will look back and think, "That's obvious. Why didn't I think of that?"

This is how innovation seems to work. It's an incredible personal challenge which, in retrospect, will involve apparently obvious solutions.

pln2bz
Re: Online scientific discourse is broken and it can be fixe

There are a couple of trending stories which relate to our thread here. There is one which pertains to the failure of scientometrics to act as a useful guide for judging science here:
How journals like Nature, Cell and Science are damaging science
The incentives offered by top journals distort science, just as big bonuses distort banking
Randy Schekman
The Guardian

I am a scientist. Mine is a professional world that achieves great things for humanity. But it is disfigured by inappropriate incentives. The prevailing structures of personal reputation and career advancement mean the biggest rewards often follow the flashiest work, not the best. Those of us who follow these incentives are being entirely rational – I have followed them myself – but we do not always best serve our profession's interests, let alone those of humanity and society.

We all know what distorting incentives have done to finance and banking. The incentives my colleagues face are not huge bonuses, but the professional rewards that accompany publication in prestigious journals – chiefly Nature, Cell and Science.

These luxury journals are supposed to be the epitome of quality, publishing only the best research. Because funding and appointment panels often use place of publication as a proxy for quality of science, appearing in these titles often leads to grants and professorships. But the big journals' reputations are only partly warranted. While they publish many outstanding papers, they do not publish only outstanding papers. Neither are they the only publishers of outstanding research.

These journals aggressively curate their brands, in ways more conducive to selling subscriptions than to stimulating the most important research. Like fashion designers who create limited-edition handbags or suits, they know scarcity stokes demand, so they artificially restrict the number of papers they accept. The exclusive brands are then marketed with a gimmick called "impact factor" – a score for each journal, measuring the number of times its papers are cited by subsequent research. Better papers, the theory goes, are cited more often, so better journals boast higher scores. Yet it is a deeply flawed measure, pursuing which has become an end in itself – and is as damaging to science as the bonus culture is to banking.

It is common, and encouraged by many journals, for research to be judged by the impact factor of the journal that publishes it. But as a journal's score is an average, it says little about the quality of any individual piece of research. What is more, citation is sometimes, but not always, linked to quality. A paper can become highly cited because it is good science – or because it is eye-catching, provocative or wrong. Luxury-journal editors know this, so they accept papers that will make waves because they explore sexy subjects or make challenging claims. This influences the science that scientists do. It builds bubbles in fashionable fields where researchers can make the bold claims these journals want, while discouraging other important work, such as replication studies.
It should be noted that the problem which we see manifesting through this measure of value known as scientometrics is not restricted to the domain of science. The underlying problem is also a hot topic in the domains of both marketing and blogging/journalism, under the keyword "viral marketing". For people wanting to know more about this, I would recommend checking out one of the numerous Ryan Holiday interviews on YouTube. For instance, this one:

"The newness of a scoop is what gets the clicks. And so, the fact that someone else may have a better article, or a sort of reasoned or complicated or nuanced take on the issue, it doesn't matter. It's whose [article] transmits faster, whose transmits better. That's why everyone is always racing to be first — because whoever is first kind of always gets the lion's share of the attention. And then the other people have to be either louder or somehow more interesting, and oftentimes context — while more informative — is not more interesting than a straight ... sort of, scandalous scoop."

"The number one predictor of articles that go viral for the New York Times … is how angry the article makes the reader. When you think about it, that does make sense, right, because think about an article you read when your reaction was like, 'Oh, okay, good article.' That's not a very viral emotion. But, if your reaction was, 'I'm so pissed. I can't believe they wrote this.' You're gonna send it to 10 people, and blogs and newspapers and online publishers exploit that reaction all of the time because they'd rather have 10 page views than 1 page view … Even if one is inherently less satisfying than the other."

Notice that in each case — be it the science magazines, the executive pay issue or with journalism — the pattern is the same: A simplistic metric is introduced as a means of judging a particular endeavor, and due to the failure of that metric to accurately measure that endeavor, the quality of the activity itself becomes reduced. This is why I advocate putting a box around scientometrics. Scientometrics should be active at only the model-level view of scientific discourse, and it should be associated with a particular worldview. There is no sense at all in applying citation analysis to discourse which spans worldviews.
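The point in the quoted article, that a journal's score is an average and so says little about any individual paper, can be illustrated with a little arithmetic. This is only a sketch: the citation counts below are invented for illustration and do not describe any real journal.

```python
from statistics import mean, median

# Hypothetical citation counts for ten papers in one journal (made-up numbers).
citations = [0, 1, 1, 2, 2, 3, 3, 4, 4, 180]

# The journal-level score is an average over all papers.
journal_average = mean(citations)

# The median is a better guide to what a typical paper in the journal receives.
typical_paper = median(citations)

# Count how many papers fall below the journal's own average.
below_average = sum(1 for c in citations if c < journal_average)

print(journal_average, typical_paper, below_average)
```

In this made-up case a single heavily cited paper lifts the average far above what nearly every other paper in the journal receives, which is the distortion the article describes.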

CharlesChandler
Re: Online scientific discourse is broken and it can be fixe

Lloyd wrote:
I wrote up a few years ago what I think is a fairly complete version of a proper Scientific Method, which I posted on the other forum and this one and again on your site, Charles. I suppose the improved sci method in this outline is related.
Yes. I'd like to see how much of that material can be integrated into this new structure. You didn't seem to like outlines, preferring instead to see things in flat lists, and we never achieved a consensus on how to proceed because of that. I'd like to encourage you to maintain your own lists. Parallel strategies aren't necessarily a bad thing: every strategy has its benefits, and it's not going to hurt us to see the same material presented in a variety of ways. It will help us learn what methods are good for what types of projects, and how to utilize the full complement of technologies available to us.

I just have a really hard time thinking entirely of sequential lists — I'm heavily into the concept of distributed network thinking. Essentially, verbal thinking is totally sequential, but the brain in general is a massively parallel, distributed processor (MPDP). Imagine a fishnet, where the knots are concepts, and the strands between the knots are the contributing concepts. For example, "See Johnny run" is a simple concept. So what is a concept? It's a bundle of other concepts. Seeing, Johnny, and running are all concepts in their own right, and "See Johnny run" is a bundle of those concepts. The supporting concepts are themselves bundles of many things. How do you serialize that into a step-by-step description of the whole thing? To communicate verbally, we have to find a way to do this, but internally, the concepts are not stored in a sequential way. Rather, they're distributed in nodes of a network, and exploring those relationships in the native internal language of the human brain requires acknowledging the structure.

This is why flowcharting is so intuitive — it shows concepts as nodes connected to each other, sometimes in fully non-linear ways, and sometimes, that's the only way to accurately describe a complex thought pattern. Outlining is just one step above straight text, but it's fast, and it enables the clustering of related concepts.

I still believe that journal-style (i.e., stream of consciousness, time-based) verbiage is useful for capturing ideas, and for getting quick feedback. The lack of structure in it is extremely useful, because creativity isn't suppressed. The next step up is outlining. The next step is flowcharting. I think that we need to use all of these methods. I guess that the only point of disagreement is that I believe that we have to start doing something, or all of these ideas are going to get lost, as they always do in journal-style discussions. ;) So I see online discussions as workbenches, but where are the products that we are producing? ;)
Lloyd wrote:
Barbara Sher has a planning method for achieving true goals, which she calls Wishcraft. She says failing to plan is planning to fail. I've had fair results with her method and try to use it pretty often. She says it can work for groups as well as individuals, at least for any goals that the members of the group share. After determining goals, she says to plan by working backwards in action steps from the goal in the future to the present.
I agree with the goal-oriented approach. There again, I like outlining, ;) because it enables the identification of major milestones, which can be broken down into their individual strategies and procedures.
Lloyd wrote:
Goal: Improve Science and Maximally Reduce All Tendencies toward Error.
4. Public Library of Science Knowledge is established online
3. A wiki system similar to Wikipedia is built to hold all Science Knowledge
2. Efficient Process of sorting Known, Unproven and Disproven theories is established
1. A group of collaborators organizes for the above purpose
Yes! We're floundering on steps #1 and #2, but at least we've laid out a lot of material, which is a start. And it's more than just discussions — we've produced documents! :D But it needs to be more than just you & me working on it — somehow we have to come up with a framework that will support collaboration among several or many people. Individuals aren't worth much in the grand scheme of things, at least not compared to what teams can do. I'm still convinced that somewhere in here, there is a combination of technologies that can allow people to author material freely, while after-the-fact, the value can be gleaned out of it and represented in a "master document". When people start going back and forth between the discussions and the master document, seeing the value accumulate in the master document, and seeing higher quality discussions that pick up where the master document leaves off, they'll "get it", and then the pace will pick up rapidly. So this is why I've been working on the "summary" and "outline" features in QDL.

And no, pushing my software isn't an ulterior motive. ;) I'll never make any money off of that. I'm pushing it because I think that it's the right tool. Why? I thought that this was the way to go, and that's why I implemented it that way! :) So I'm not pushing it because I wrote it — I wrote it because I bought into the ideas. ;)
Lloyd wrote:
Or is it best to discuss in both locations?
I think that we should discuss it here, but refer to the structure that is emerging in the outline on my site.
Lloyd wrote:
Charles, do you want to have Reply buttons in the outline?
I shut them off inside the outline, because they were too distracting. To add a sub-topic, you have to use the "Action Icon" (which is now user-selectable).
pln2bz wrote:
What I try to do is to be extremely cautious about the largest, most important decision points. When I am unsure about how to proceed, what I generally do is expose myself to the relevant domains.
I understand that, and I agree. Jumping to conclusions typically results in lost opportunities, so it's good to be circumspect. I'm not suggesting that we lock down on anything. I'm just saying that the value needs to start accumulating somewhere, and it isn't going to do that in a thread, which is just the trail that we leave behind us. ;)
pln2bz wrote:
In this particular case, on the specific topic of visualizing science, knowledge and controversy, there is actually an enormous volume of pre-existing content which I had already intended to discuss on the related subjects of …
I'd like to suggest that you use my site as a repository for your documents. (My site can function as a wiki, as well as a forum, as well as an outlining tool, to name a few of its capabilities.) The point here is that we need to be adding value to an emerging structure. It's one thing to converse, but until a document is produced that can be criticized, and fixed, and criticized again, and fixed again, the value doesn't accumulate. In other words, the plane is definitely moving, but it isn't going to take off like that. So let's have a look at the documents you've been putting together! ;)
pln2bz wrote:
It's conceivable and even probable that there is actually more than one solution to the problem.
Absolutely! You have a great attitude, being open-minded, and not thinking in all-or-nothing terms, which tends to be the end of anything. ;)
pln2bz wrote:
There are a couple of trending stories which relate to our thread here. There is one which pertains to the failure of scientometrics to act as a useful guide for judging science...
I added this to the outline, along with other identified problems with mainstream science. If you have other articles, please let me know, and I'll add them to the outline too, so we can get the known problems laid out.

pln2bz
Re: Online scientific discourse is broken and it can be fixe

Imagine a fishnet, where the knots are concepts, and the strands between the knots are the contributing concepts. For example, "See Johnny run" is a simple concept. So what is a concept? It's a bundle of other concepts. Seeing, Johnny, and running are all concepts in their own right, and "See Johnny run" is a bundle of those concepts. The supporting concepts are themselves bundles of many things. How do you serialize that into a step-by-step description of the whole thing? To communicate verbally, we have to find a way to do this, but internally, the concepts are not stored in a sequential way. Rather, they're distributed in nodes of a network, and exploring those relationships in the native internal language of the human brain requires acknowledging the structure.
You're describing what is known in the computer science world as a graph. Graphs are conceptually quite simple to understand, but they will be unfamiliar to many programmers who learned to build databases on the traditional relational model. If you look at what is happening in the CS world today, you will see a large-scale trend towards re-conceptualizing problems in terms of graphs, mainly because graphs are often far faster at handling situations where there are lots of one-to-many correspondences. To do that in SQL, you'd have to do what's called a table join. And the problem with joins is that they are an ad hoc solution which does not facilitate fast searching through graph-like structures for big data solutions involving lots of servers. What programmers have been re-discovering is that many problems generally fit the graph paradigm better than the relational one. In my view, even the typical recommendation engine that you use on Amazon would be very hard to make work on a relational database. It would be so slow as to be useless.
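As a rough sketch of the "fishnet" idea expressed as a graph, the snippet below stores each concept with the concepts it bundles and then walks outward from "See Johnny run". The supporting concepts (perception, person, and so on) and the function name are made up for illustration; this is a plain in-memory dictionary, not a graph database.

```python
from collections import deque

# Hypothetical concept network: each concept maps to the concepts it bundles.
CONCEPTS = {
    "See Johnny run": ["seeing", "Johnny", "running"],
    "seeing": ["perception"],
    "Johnny": ["person", "name"],
    "running": ["motion", "legs"],
    "perception": [],
    "person": [],
    "name": [],
    "motion": [],
    "legs": [],
}


def contributing_concepts(graph, start):
    """Return every concept reachable from `start`, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order


print(contributing_concepts(CONCEPTS, "See Johnny run"))
```

The breadth-first walk is one way of "serializing" the network into a sequence: it lists the direct contributing concepts first, then the concepts those are built from.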

I highly recommend that you explore this notion of graphs to the greatest depth that your mathematical skills permit if you have an interest in this topic. I'll do my best to help, but the reason I say this is that graphs have now been successfully applied to just about all problems other than science, at this point. And yet, as you note above, graph databases are also incredibly suited to science's inherent structure. So, I can promise you that we are probably not the only ones talking about this right now.

Lloyd
Re: Online scientific discourse is broken and it can be fixe

Goal
Goal: Improve Science and Maximally Reduce All Tendencies toward Error.
4. Public Library of Science Knowledge is established online
3. A wiki system similar to Wikipedia is built to hold all Science Knowledge
2. Efficient Process of sorting Known, Unproven and Disproven theories is established
1. A group of collaborators organizes for the above purpose


CC said we're floundering on 1 & 2, but I feel that I have a good sense of how to do #2, as I discussed in an earlier post. But I agree that #1 has been a wicked problem for about a year now. At least two of us share interest in step #1 and are looking for solutions. I hope Chris, Plas and maybe others may mention interest in it too.

Sci Method - CC, you said I prefer flat lists instead of outlines and as a result we didn't get any further on developing Sci Method. You, Brant and I later had some discussion of my version of Sci Method and Brant disagreed with my original final step 7. Then I revised my steps to 5, but placed 2 of them as substeps under another step and I revised step 7, which became step 5. But I think the discussion ended there, at least so far, without further comments.

Concepts - Your explanation of concepts is interesting. Analyzing my own memory, I know that I organize concepts in lists and as parts of past experiences. I have a mental list of all the U.S. states and their capitals which I can remember either alphabetically or geographically. If I think about a particular state I can remember experiences and people associated with it. Your example of concepts in a simple sentence, See Johnny run, says each word is a concept, and you could add that each of those concepts is one of a large, if not infinite, class. Seeing can be direct, or in a mirror, or through a peephole, or mentally etc. Johnny can have any face or form, besides a boy or man, or Caucasian boy, can be any race and either sex, can even be a robot, or animal etc. Running can be with legs, in a race, or game, or in any setting, even without legs, like in a car, or can be like a refrigerator running (a refrigerator named Johnny), or a nose (if the fluid or mucus etc is given the name of Johnny), etc.

General Semantics - I read some General Semantics in the 1970s, and it stresses using low levels of abstraction for better understanding. We've been using some high level abstraction here, so we have some trouble understanding each other, I think. A low level abstraction requires adding more concepts to describe a subject in enough detail to be meaningful for all involved in discussion and the audience. Time and place are among the more helpful modifying concepts. Pictures and diagrams also help clarify a subject. GS also emphasizes caution about the "is" of "identity" and confusing maps with territories. The words we're using are maps in a sense. Concepts are also maps. The territory is the real thing that exists independent of our minds. A picture of Johnny running is not Johnny running. It's another map.

Tools
We've been discussing tools here too. We each have some expertise with different tools and with some of the same ones. Analysis is a tool. Our writings are tools. Lists, outlines, dialog maps, flowcharting, forums, online resources, planning methods, sci method etc are tools. And it's helpful to have a variety of collaborators who have expertise with different tools, as with any kind of major construction etc.

Here are some tools I've found to be somewhat productive online for collaboration. 1. Forums are good for normal conversations, similar to in-person conversations, but with long breaks. 2. Google Documents worked well last year for a few months about once a week for close to 2 hours each time with 4 of us from this forum all often texting simultaneously on the same document. Charles was the first to suggest that the discussions there had stopped being very useful. 3. A few years ago I organized a worldwide simultaneous discussion with about 7 people weekly for about 2 months. The hardest part was finding a time when all could be online at the same time, although that was a bit of a problem with the Google Doc discussions too.

Tool Practice - We're still looking for a tool to help us reach our goal more efficiently. Charles appears to have a means to build an efficient tool on his site, which I think has been improving considerably. I'm hoping there will be a breakthrough soon that will result in a tool that will do the job in conjunction with some of the other tools already available. I'm willing to practice with promising potentially cutting-edge tools with anyone else who's interested.

pln2bz
Re: Online scientific discourse is broken and it can be fixe

Why Representing Scientific Discourse is Fundamentally a Graph-Style Problem,
And Why That Matters


Here's a sample social graph, much like what Facebook does …

[Image: sample social graph]


Here is an excerpt from the introduction of the O'Reilly programming text, Graph Databases:
What Is a Graph?

Formally, a graph is just a collection of vertices and edges—or, in less intimidating language, a set of nodes and the relationships that connect them. Graphs represent entities as nodes and the ways in which those entities relate to the world as relationships. This general-purpose, expressive structure allows us to model all kinds of scenarios, from the construction of a space rocket, to a system of roads, and from the supply-chain or provenance of foodstuff, to medical history for populations, and beyond.

Graphs Are Everywhere

Graphs are extremely useful in understanding a wide diversity of datasets in fields such as science, government, and business. The real world—unlike the forms-based model behind the relational database—is rich and interrelated: uniform and rule-bound in parts, exceptional and irregular in others. Once we understand graphs, we begin to see them in all sorts of places. Gartner, for example, identifies five graphs in the world of business — social, intent, consumption, interest, and mobile — and says that the ability to leverage these graphs provides a "sustainable competitive advantage."
Texts on graph theory today exist to service the entire spectrum of interest: from those with PhDs who want to learn how to mathematically analyze graphs using fuzzy logic, statistical tools, machine learning, and so on, to the more pragmatic programming texts, and all the way down to well-written introductory texts intended for lay audiences who simply want to know what a graph is in clear terms. I've scanned most of what is out there, and for the people here, I would strongly recommend …

Graphs and Applications - An Introductory Approach (the gentlest introduction of all options)
Graphs and their Uses

Here's a sample graph (from the highly recommended text, Graph Databases) which depicts a small Twitter network. It's actually somewhat self-explanatory:

Image


The text explains, perhaps unnecessarily (?):
A property graph has the following characteristics:

  • It contains nodes and relationships
  • Nodes contain properties (key-value pairs)
  • Relationships are named and directed, and always have a start and end node
  • Relationships can also contain properties
Most people find the property graph model intuitive and easy to understand. Although simple, it can be used to describe the overwhelming majority of graph use cases in ways that yield useful insights into our data.
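The four characteristics listed above can be sketched in a few lines of Python. This is only an illustrative toy, not how any real graph database stores its data. The class names, the FOLLOWS relationship name, and the users "alice", "bob" and "carol" are all made up for the example and do not come from the book.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node carrying arbitrary key-value properties."""
    node_id: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A named, directed edge that always has a start and an end node."""
    name: str
    start: Node
    end: Node
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical in-memory container for nodes and relationships."""

    def __init__(self):
        self.nodes = {}
        self.relationships = []

    def add_node(self, node_id, **properties):
        node = Node(node_id, dict(properties))
        self.nodes[node_id] = node
        return node

    def relate(self, start_id, name, end_id, **properties):
        # Both endpoints must already exist, so no edge can dangle.
        rel = Relationship(name, self.nodes[start_id], self.nodes[end_id],
                           dict(properties))
        self.relationships.append(rel)
        return rel

    def outgoing(self, node_id, name):
        """Ids of the nodes reached from node_id by relationships called name."""
        return [r.end.node_id for r in self.relationships
                if r.start.node_id == node_id and r.name == name]


# A tiny made-up "who follows whom" network.
graph = PropertyGraph()
graph.add_node("alice", name="Alice")
graph.add_node("bob", name="Bob")
graph.add_node("carol", name="Carol")
graph.relate("alice", "FOLLOWS", "bob", since=2012)  # relationship property
graph.relate("bob", "FOLLOWS", "alice")
graph.relate("alice", "FOLLOWS", "carol")

print(graph.outgoing("alice", "FOLLOWS"))
```

Because relationships are directed, "alice follows carol" does not imply the reverse, which is why asking for carol's outgoing FOLLOWS relationships returns nothing here.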
There is a short list of reasons why graphs are preferable for problems like scientific social networks. The two most important reasons are (again, from the O'Reilly text):
Performance

One compelling reason, then, for choosing a graph database is the sheer performance increase when dealing with connected data versus relational databases and NOSQL stores. In contrast to relational databases, where join-intensive query performance deteriorates as the dataset gets bigger, with a graph database performance tends to remain relatively constant, even as the dataset grows. This is because queries are localized to a portion of the graph. As a result, the execution time for each query is proportional only to the size of the part of the graph traversed to satisfy that query, rather than the size of the overall graph.

Flexibility

As developers and data architects we want to connect data as the domain dictates, thereby allowing structure and schema to emerge in tandem with our growing understanding of the problem space, rather than being imposed upfront, when we know least about the real shape and intricacies of the data. Graph databases address this want directly. As we show in Chapter 3, the graph data model expresses and accommodates business needs in a way that enables IT to move at the speed of business.

Graphs are naturally additive, meaning we can add new kinds of relationships, new nodes, and new subgraphs to an existing structure without disturbing existing queries and application functionality. These things have generally positive implications for developer productivity and project risk. Because of the graph model's flexibility, we don't have to model our domain in exhaustive detail ahead of time—a practice that is all but foolhardy in the face of changing business requirements. The additive nature of graphs also means we tend to perform fewer migrations, thereby reducing maintenance overhead and risk.
It's important that as a coder who is interested in creating a system which might be used by many thousands or even millions of users, you know that the technology you're using is scalable. The worst-case scenario is that a company is blowing up in the news, users are piling on by the tens of thousands per day, and your site is becoming sluggish even as you add additional servers. This is the wrong time to realize that you've chosen the wrong technology. The fact is this:

The scientific social network problem is a graph type of problem. It cannot be solved at scale with a traditional table-based database. Anybody who thinks that they can solve this problem without learning about graphs needs to take a step back and rethink how they are approaching it, because without graphs you would be heading down a path to absolute failure the moment you happened, by chance, to create something that lots of people used.

For Those Who Remain Curious, Here's Why
(The Rest Can Skip Ahead)


From Graph Databases:
Relational Databases Lack Relationships

For several decades, developers have tried to accommodate connected, semi-structured datasets inside relational databases. But whereas relational databases were initially designed to codify paper forms and tabular structures—something they do exceedingly well—they struggle when attempting to model the ad hoc, exceptional relationships that crop up in the real world. Ironically, relational databases deal poorly with relationships.

Relationships do exist in the vernacular of relational databases, but only as a means of joining tables. In our discussion of connected data in the previous chapter, we mentioned we often need to disambiguate the semantics of the relationships that connect entities, as well as qualify their weight or strength. Relational relations do nothing of the sort. Worse still, as outlier data multiplies, and the overall structure of the dataset becomes more complex and less uniform, the relational model becomes burdened with large join tables, sparsely populated rows, and lots of null checking logic. The rise in connectedness translates in the relational world into increased joins, which impede performance and make it difficult for us to evolve an existing database in response to changing business needs.

Figure 2-1 shows a relational schema for storing customer orders in a customer-centric, transactional application.
Image
The application exerts a tremendous influence over the design of this schema, making some queries very easy, and others more difficult:

  • Join tables add accidental complexity; they mix business data with foreign key metadata.
  • Foreign key constraints add additional development and maintenance overhead just to make the database work.
  • Sparse tables with nullable columns require special checking in code, despite the presence of a schema.
  • Several expensive joins are needed just to discover what a customer bought.
  • Reciprocal queries are even more costly. "What products did a customer buy?" is relatively cheap compared to "which customers bought this product?", which is the basis of recommendation systems. We could introduce an index, but even with an index, recursive questions such as "which customers bought this product who also bought that product?" quickly become prohibitively expensive as the degree of recursion increases.

Relational databases struggle with highly connected domains.
The text goes on to showcase a couple of concrete examples. What generally proves so catastrophic for the relational type of solution is that, to answer certain database queries, the system must scan every single entry in a particular table. Graph databases largely avoid this problem for such queries, because each node is stored with direct links to its neighbours and there are no table structures exerting a drag upon our algorithms.
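To make the scan-versus-traversal point concrete, here is a rough Python sketch. It is a toy under stated assumptions, not a benchmark: the purchase rows, customer names and function names are invented, the "table" has no index, and a real relational database would behave differently once indexes are added. It only shows why following links from one node touches fewer entries than walking a whole table.

```python
from collections import defaultdict

# Made-up purchase data: one (customer, product) row per purchase.
purchases = [
    ("ann", "book"),
    ("ann", "lamp"),
    ("ben", "book"),
    ("cat", "desk"),
    ("ben", "desk"),
]


def buyers_by_scan(rows, product):
    """Table-style lookup without an index: every row is examined."""
    examined = 0
    buyers = []
    for customer, bought_product in rows:
        examined += 1
        if bought_product == product:
            buyers.append(customer)
    return buyers, examined


# Graph-style storage: each node keeps direct links to its neighbours,
# in both directions, so reciprocal questions cost the same either way.
bought = defaultdict(list)     # customer -> products
bought_by = defaultdict(list)  # product -> customers
for customer, product in purchases:
    bought[customer].append(product)
    bought_by[product].append(customer)


def buyers_by_traversal(product):
    """Follow the product's own links; only its neighbours are touched."""
    neighbours = bought_by.get(product, [])
    return list(neighbours), len(neighbours)


def also_bought(product):
    """Two hops: other products bought by customers who bought this one."""
    related = set()
    for customer in bought_by.get(product, []):
        for other in bought.get(customer, []):
            if other != product:
                related.add(other)
    return sorted(related)


print(buyers_by_scan(purchases, "book"))
print(buyers_by_traversal("book"))
print(also_bought("book"))
```

The scan examines all five rows no matter which product is asked about, while the traversal only touches the entries linked to that product. The two-hop `also_bought` is the recommendation-style question from the excerpt, answered by following links rather than joining tables.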

It should be apparent that the inherent structure of science — the concepts, propositions, models and worldviews, as well as the discourse which connects them — is fundamentally more like a product recommendation engine (where the algorithms are routinely confronted with one-to-many relationships like people connected by a book purchase) than the more conventional table-based relational database that was originally intended to deal with one-to-one relationships (think price sheet).

Now, recall how I've been adamant that scientific discourse should be broken down into the different domains. Graph databases are designed to handle such scenarios without any trouble at all, because nodes of different types can connect with one another. There's no need to match the node types (the concepts, propositions, models and worldviews). What this permits us to do is to — for example — refer to any concept, proposition, model or worldview from any other concept, proposition, model or worldview. Notice how the following multiple-domain graph does just that:

Image


The three domains in that graph are differentiated by line style (note the dotted, dashed and solid lines). What I invite people here to do is to think carefully about why this might be useful for us: People tend to engage science at one particular level of the discourse at any single instant. Our minds do not generally engage science at all levels simultaneously. So, to reduce categorical confusion, my suggestion is that there is incredible value in having the ability to encode the different components of the discourse according to the type of contribution. What this ultimately does is give the user options in how to view the discourse. If a person doesn't want to think about questioning assumptions at a particular moment, then they can simply turn the worldview level of discourse off. Having that ability would represent a unique feature for this website which I feel would better support discussions of the Electric Universe, as well as other ideas associated with the NPA.
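As a rough sketch of what "turning a level off" could look like, the Python below tags each node with one of the four levels of discourse and hides a level on request. Everything here is an assumption made for illustration: the node ids (c1, p1 and so on), the edges, and the function name are hypothetical, and a real system would likely tag relationships as well as nodes.

```python
# Each node is tagged with the level of discourse it belongs to.
# The ids are placeholders, not real concepts or models.
nodes = {
    "c1": "concept",
    "c2": "concept",
    "p1": "proposition",
    "m1": "model",
    "w1": "worldview",
}

# Directed links; any type of node may refer to any other type.
edges = [
    ("p1", "c1"),
    ("m1", "p1"),
    ("m1", "c2"),
    ("w1", "m1"),
    ("w1", "c1"),
]


def visible_subgraph(nodes, edges, hidden_domains):
    """Return the nodes and edges left after hiding the given domains."""
    hidden = set(hidden_domains)
    kept_nodes = {n: d for n, d in nodes.items() if d not in hidden}
    # An edge stays visible only if both of its endpoints are visible.
    kept_edges = [(a, b) for a, b in edges
                  if a in kept_nodes and b in kept_nodes]
    return kept_nodes, kept_edges


# A reader who does not want to question assumptions right now:
kept_nodes, kept_edges = visible_subgraph(nodes, edges, ["worldview"])
print(sorted(kept_nodes))
print(kept_edges)
```

Nothing is deleted from the underlying graph; the filter only changes what is shown, so the worldview level can be switched back on at any time.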

Now, if you go back and review what I said the other day, you'll see that I took this even a step further, by linking intent & values to each of these domains. This might seem arbitrary to some people, and it may turn out in due time that this decision introduces more problems than it solves. I honestly don't know if that idea will stand the test of time yet, but please realize that the reason I did that is to SIMPLIFY this whole process for users who are so new to science that they don't yet understand the actual structure of science. For those people, they can simply think in terms of learning (concepts), asking (propositions), creating (models) or debating (worldviews). My hope is that people will think deeply about the vague correspondence that appears to exist between these two very different sets of domains (intention and level of discourse), because having a system of weakly interacting communications channels provides for an ability to screen out that which a person is not currently focused upon.
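The correspondence between intent and level of discourse described above amounts to a small lookup table. Here is a minimal Python sketch of it, assuming a hypothetical list of posts that each carry a domain tag; the post ids and the function name are invented for the example.

```python
# The pairing suggested above: what a user intends to do, and the
# level of discourse that intent corresponds to.
INTENT_TO_DOMAIN = {
    "learning": "concept",
    "asking": "proposition",
    "creating": "model",
    "debating": "worldview",
}

# Hypothetical posts, each tagged with one domain.
posts = [
    {"id": 1, "domain": "concept"},
    {"id": 2, "domain": "worldview"},
    {"id": 3, "domain": "proposition"},
    {"id": 4, "domain": "model"},
    {"id": 5, "domain": "concept"},
]


def posts_for_intents(posts, intents):
    """Ids of the posts whose domain matches any of the chosen intents."""
    wanted = {INTENT_TO_DOMAIN[intent] for intent in intents}
    return [post["id"] for post in posts if post["domain"] in wanted]


# A newcomer who only wants to learn and ask questions for now:
print(posts_for_intents(posts, ["learning", "asking"]))
```

A newcomer never has to know the names of the four domains; picking an intent is enough to screen out the channels they are not focused on.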

Also notice how thinking in graphs tends to naturally break the problem down between representation and visualization. We can actually have two very distinct conversations here — one about how to represent and the other on how to visualize the discourse.

Plasmatic
Re: Online scientific discourse is broken and it can be fixe

Ironically, as this thread is getting more "all over the map", my interest in contributing wanes.... That is, the more points get ignored, or buried, the less the analysis sides towards a benefit! (I said that just to annoy anti-value based epistemologies that promote "need" worship ;) )

Anyway, Chris, I recommend you read this paper:

Stove's Discovery of the Worst Argument in the World.

http://web.maths.unsw.edu.au/~jim/worst.html
"Let us consider another Gem-laden lode, post-Kuhnian philosophy of science. The replacement of the logic and philosophy of science by its history certainly raises the suspicion that there is a `Worst argument' at the bottom of it. Stove writes:

The Kuhnian is scandalized if you call a current scientific paradigm `true' or an earlier one `false,' or if you say that the later one is `probably nearer the truth' than the earlier. Paradigms are incommensurable, he tells you, and no special authority attaches to one which governs a field of science now. And why must we accept this astounding and sordid democracy of paradigms? Why, just because, in any field, even the best scientific knowledge which is current now, or at any time, is always rigidly constrained within the limits imposed, by the paradigm prevailing at the time, on scientific knowledge. (Stove, 1991, 168)

It is not clear how accurately this represents Kuhn himself. Partly, this is because he just said, `Let's do history, as it is so much more exciting than boring old logic.' He does, it is true, state conclusions that seem to require such an argument, such as `There is, I think, no theory-independent way to reconstruct phrases like "really there"; the notion of a match between the ontology of a theory and its "real" counterpart in nature now strikes me as illusive in principle. Besides, as a historian, I am impressed with the implausibility of the view.' (Kuhn, 1970, 206-7; discussion in McGrew, 1994). But no argument is included. His followers have made up the slack, especially those in the `Strong Program in the Sociology of Knowledge' or social constructivism, like Bloor. They propose to replace all considerations of logic, of what scientific theories are reasonable, with considerations of sociology, that is, of what interests theories serve. The real reason for their views is their conviction that since science is done by people, its explanation should be in the realm of causes acting on people, not the realm of abstract reasons. People, they think, can be acted on by their interests, or patronage, or the social milieu, but abstract facts like 2 + 2 = 4 do not act. So explanations of how people, including scientists, think ought to be sociological. This argument appears in various forms, mostly not very explicit ones. Thus, Bloor argues that observation `underdetermines' theory – that is, that several theories are logically compatible with any given body of observations – and concludes immediately that it must be social factors that determine which theory is chosen. (Bloor, 1976; Bloor, 1991, 171-2) He says that the `existence of nature' does not account for (scientific) theories and that simple `attention to nature' will not adjudicate the merits of our theories. 
(Barnes, Bloor and Henry, 1996, 48) He reserves particular anger for the opinion that belief in reasonable theories is at least in part explained by their being reasonable, while mistakes require causal explanations; Bloor says sarcastically that this is an attempt to render science `safe from the indignity of empirical explanation.' (Bloor, 1976, 7, 5) It must be emphasised that Bloor does not admit any possibility of co-operation between causes and reasons: explanation in terms of causes is quite different to that in terms of reasons, he says; if one is right, the other is wrong. (Bloor, 1976, 9)

This argument, the central plank of the social constructivist position, is a version of Stove's `Worst Argument' because it says: `We can know things only via causal (social) processes acting on the brains of real scientists, therefore, the content of our theories is explained without remainder by the social factors causing them; that is, we cannot know things as they are in themselves.' This is why no amount of raging about relativism, scepticism and truth is found to make any impact on constructivists. They have a last line of defence in the argument: `Those entities in Platonic worlds, like truths and theories, cannot cause belief in themselves. Scientists are people, after all, and as such are responsive only to social or similar causes.'

Like all such arguments, Bloor's says, in effect, that the mere fact that a theory is accepted is a reason for not accepting it."

Plasmatic
Re: Online scientific discourse is broken and it can be fixe

Chris, your last post motivated me to figure out how to post a concept map here:

http://img689.imageshack.us/img689/2911/s48p.jpg

Edit: can someone turn that link into an image here?

Lloyd
Re: Online scientific discourse is broken and it can be fixe

Plasmatic's Graph (or Concept Map)
Image

CharlesChandler
Re: Online scientific discourse is broken and it can be fixe

pln2bz wrote:
Graphs are logarithmically faster at handling situations where there are lots of one-to-many correspondences.
Yes, and that's why I'm using that data model. In QDL, each node can contain text, images, and links to other nodes. So once you've grabbed the node, you don't have to go back to the database to find out what it's related to. And yes, this scales quite well. And since my software is online, the "reports" are all real-time, so I had to implement high-performance code throughout the app.
pln2bz wrote:
I highly recommend that you explore this notion of graphs to the greatest depth that your mathematical skills permit if you have an interest in this topic.
I have my hands full already with QDL.
Lloyd wrote:
I feel that I have a good sense of how to do #2 [Efficient Process of sorting Known, Unproven and Disproven theories is established], as I discussed in an earlier post.
Is there any chance that you would be willing to work within the Astronomy / Outline for the evaluation of astronomy theories, and the Scientific Process / Outline for process-related ideas? As I keep saying, these ideas need to start accumulating somewhere, and the format needs to support easy navigation of large amounts of material, and ongoing edits as people see places where they can contribute. I feel that these expandable outlines satisfy those requirements. This might not be the ultimate technology for this, but we need to get up above the level of just carrying on forum discussions, in order for the value to start accumulating.

CharlesChandler
Re: Online scientific discourse is broken and it can be fixe

Speaking of graphics, I hooked up an SVG editor on my site. So now you can do drawings, graphs, etc. That's in addition to the WYSIWYG HTML editing, and the proper display of a number of other text file formats (CSV, EML, JS, PAS, PHP, RIS, RSS, RTF, SQL, TAB, TXT, TeX, VCF, and XML).

Lloyd
Re: Online scientific discourse is broken and it can be fixe

Charles, I looked over the two places you linked to above and I tried to contribute a little, but I don't know if I did so in the way you hoped. If not, it may help to explain in more detail what you'd like, or which specific topic you may prefer starting with.

I see that you're trying to record all the good ideas you find here and elsewhere on your site and organize them and put some of them into use.

Powered by Quick Disclosure Lite
© 2010~2021 SCS-INC.US