Distinguishing Encephalitis from Encephalopathy


Encephalitis may be defined as infection or inflammation of the brain substance, resulting typically in disturbed sensorium and perhaps seizures or focal neurological deficits and sometimes pyrexia. Encephalopathy, on the other hand, represents disturbed sensorium not due to an infective or inflammatory cerebral process, and its causes range from toxins and drugs to metabolic upset, non-cerebral sepsis, cerebral hypoperfusion and post-ictal states.

Distinguishing the two is important because considerable morbidity and mortality are associated with delayed treatment of encephalitis with appropriate antiviral, antibiotic or immune therapy, and with delayed treatment of the various other causes of encephalopathy.

The paper presented, “To what extent can clinical characteristics be used to distinguish encephalitis from encephalopathy of other causes? Results from a prospective observational study” by Else Quist-Paulsen et al., attempts to use clinical and rapidly available investigatory findings to distinguish the two conditions by a prospective observational study on 136 patients.

They identified candidate patients on the basis that they had had a lumbar puncture, and then excluded those with no evidence of encephalopathy. Their criteria for encephalitis were:

  • Pyrexia
  • Encephalopathy > 24 hours with no other cause identified and 2 of:
  • CSF WCC >= 5 x 10^6/l
  • New onset seizures
  • New onset focal neurological findings
  • CT/MRI consistent with encephalitis
  • EEG consistent with encephalitis

The gold standard by which to gauge their test would surely be a definitive diagnosis but, as is commonly the case in clinically suspected encephalitis, such a diagnosis was only made in 10 of 19 patients. In some of the patients with non-encephalitis encephalopathy, the diagnosis was also vague, e.g. “aseptic meningitis” (which could be encephalitis), “epilepsy” (which could be autoimmune encephalitis), “headache/migraine”, “unspecified disorientation or coma”.

Subsequent analysis of specific features in the two groups then becomes somewhat difficult because the criteria themselves become the gold standard and because some of the specific features were themselves among the criteria. Interestingly, systemic features of infection, such as raised blood white cells or CRP, argued against encephalitis because general sepsis was a common cause of encephalopathy. Nausea and personality change were more common in their encephalitis group.

They used ROC curves to look at the predictive value of these specific features and their combinations, but these were again judged against their “testing variable”, their criteria, not against some objective gold standard. It would have been better to look at them only in the 10 diagnosed cases rather than all 19, but then the total number of cases would be even lower.

The diversity of diagnosis of their cases was interesting, especially that Lyme disease and TB were as common as VZV and more common than HSV. Only one of their cases had NMDA receptor antibodies, but we do not know that all the patients had this test and a full battery of other autoimmune antibody tests. Many might have been put in the encephalopathy with seizures category. Since encephalitis can be associated with meningism, some “aseptic meningitis” patients might have been viral but with negative testing, or even autoimmune with a migrainous headache and stiff neck.

The group felt that the study was very worthwhile but that a clearer guide as to which cases of encephalitis warranted antimicrobial therapy or immune therapy would be the ultimate goal. This would require clarity on the gold standard diagnosis and many more patients.

The Journal Club discussion on which this post is based was presented by Dr Aram Aslanyan, Specialist Registrar in Neurology at Queens Hospital, Romford, Essex.


Making a Differential Diagnosis using a Clinical Database


A great deal of time is spent in medicine reading and writing case reports. Essentially, clinical features are listed and a diagnosis made. Excluding those cases that point to a novel means of treatment, a case report is often noteworthy simply because the diagnosis is rare, or because the clinical features were most unlikely to be associated with the diagnosis. This hardly seems a reliable method of archiving medical knowledge.

Much less time is spent on attempting a method of diagnosis that is more systematic than the recalling of case reports. One can see that if one did wish to move medical diagnosis into the information age, natural instinct would be to use an internet search engine to enter a list of clinical features and see what disease diagnoses were associated with these terms. Unfortunately, internet search engines concern themselves only with the popularity of search terms and because of the dominance of case reports such practice may be likely to throw up the least likely cause of those features, or that which is most “titillating” to those who most perform internet searches.

There have been attempts to provide a more balanced means of linking clinical features with diseases and hence making clinical diagnoses. Rare diseases with a large number of different clinical features are least easily diagnosed by clinical experience or key investigations, and so the focus of these attempts has been on rare genetic diseases, using ever-expanding databases such as Orphanet, Online Mendelian Inheritance in Man (OMIM), the London Dysmorphology Database, and the Pictures of Standard Syndromes and Undiagnosed Malformations (POSSUM) database.

One method of searching for clinical features on these databases is simple text matching. A way of quantifying the match is the feature vector method, which calculates the mathematical overlap between the Query (the clinical features of the case) and the Disease (the clinical features of the disease). A vector of the query is calculated with dimensions for each feature and a value of 1 if present and 0 if absent. The same is done for the disease. The dot product of the two vectors is the strength of the match (a 1 for both query and disease will sway the two vectors in a common direction, and a 0 for both will leave their relationship unchanged, while a 0 and a 1 will make one move away from the other).
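
The feature vector comparison can be sketched in a few lines. This is a minimal illustration only: the vocabulary, query and disease annotations are hypothetical, and the cosine normalisation is an assumption added here to show how mismatched features pull the two vectors apart, since the raw dot product on its own only counts shared features.

```python
import math

def to_vector(features, vocabulary):
    """Binary vector: 1 if the feature is present, 0 if absent."""
    return [1 if term in features else 0 for term in vocabulary]

def dot(u, v):
    """Dot product; for binary vectors this counts the shared features."""
    return sum(a * b for a, b in zip(u, v))

def cosine(u, v):
    """Dot product normalised by vector lengths, so mismatches lower the score."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0

# Hypothetical vocabulary and annotations, for illustration only.
vocabulary = ["ataxia", "optic atrophy", "seizures", "deafness"]
query = {"ataxia", "seizures"}
disease = {"ataxia", "optic atrophy", "seizures"}

q = to_vector(query, vocabulary)
d = to_vector(disease, vocabulary)
overlap = dot(q, d)          # number of features shared by query and disease
similarity = cosine(q, d)    # 1.0 only when the two feature sets are identical
```

Here the disease has one feature that the query lacks, so the overlap is two shared features but the normalised similarity falls short of 1.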

A potentially better quantification of matching is to take into account the different specificities of different clinical features. If a clinical feature is present in only a few diseases, its annotation (the linkage of a clinical feature to a disease) is more specific for that disease and so that linkage should have more weighting; in database terms this specificity is called the information content (IC). The IC is simply the negative log of the frequency of the annotation. For example, AV block is a term that annotates 3 diseases in the 4813-disease OMIM database. The frequency is 3/4813. The natural logarithm (log_e) of this is -7.38, so the IC is 7.38. A much more general term will have many annotations and a much lower IC, tending towards zero. The ICs of all the clinical features of the query can be summed or otherwise combined to provide an overall match.
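
The AV block arithmetic above can be checked with a short sketch. The function name is a hypothetical one chosen for illustration; the numbers are those quoted in the text.

```python
import math

def information_content(n_annotated, n_diseases):
    """Negative natural log of the fraction of diseases annotated with a term."""
    return -math.log(n_annotated / n_diseases)

# AV block: annotates 3 of the 4813 diseases quoted above.
ic_av_block = information_content(3, 4813)

# A term that annotates every disease carries no information.
ic_root = information_content(4813, 4813)
```

Rounded to two decimal places, `ic_av_block` is 7.38, and `ic_root` is zero.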

The authors of the presented paper have described a further refinement of this method. This is called the Ontology Similarity Search (OSS). Instead of simply matching the text of terms, they fit clinical features into a standardised language within an ontological framework. This means that the features are related to one another in a hierarchy, with more general terms higher in the hierarchy and more specific subcategories of those general terms lower in the hierarchy. While “parent” terms obviously have many “child” terms, child terms can also belong to multiple parent terms. For example, optic atrophy could be a child of demyelinating disease and also a child of visual disturbance. Their ontology is called the Human Phenotype Ontology (HPO) and has around 9000 terms.

The advantage of using the ontology is that if a clinical feature of a case does not fit the clinical features of the disease, but shares a parent term with one of the features of the disease, instead of scoring a zero match, this scores as a match but less so than if the match was with the specific terms. The method specifically finds the most informative common ancestor of the two different clinical features, and uses the IC of that term. Being a more general term, it will be a feature of more diseases and so have a lower IC. (In the database, ancestor terms are implicitly annotated when child terms are annotated.) The overall strength of match is the average of all the ICs; there will always be an IC for each feature, even if it is just that they are both a feature of “any disease”, which of course has an IC of zero and would bring down the average.
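
A minimal sketch of this ontology-based matching follows. The ontology and disease annotations are a hypothetical toy example, not real HPO or OMIM content, and the scoring rule (for each query term take the best match among the disease terms, then average) is one reading of the description above rather than a reproduction of the authors' code.

```python
import math

# Hypothetical toy ontology (child -> parents); not real HPO content.
PARENTS = {
    "phenotype": [],
    "eye abnormality": ["phenotype"],
    "movement abnormality": ["phenotype"],
    "optic atrophy": ["eye abnormality"],
    "cataract": ["eye abnormality"],
    "ataxia": ["movement abnormality"],
    "dystonia": ["movement abnormality"],
}

# Hypothetical disease annotations.
DISEASES = {
    "disease A": {"optic atrophy", "ataxia"},
    "disease B": {"cataract"},
    "disease C": {"dystonia"},
    "disease D": {"ataxia", "dystonia"},
}

def ancestors(term):
    """The term itself plus every term above it in the hierarchy."""
    found = {term}
    for parent in PARENTS[term]:
        found |= ancestors(parent)
    return found

def information_content(term):
    """-ln of the fraction of diseases annotated with the term or a descendant."""
    n = sum(1 for feats in DISEASES.values()
            if any(term in ancestors(f) for f in feats))
    return -math.log(n / len(DISEASES))

def mica_ic(term_a, term_b):
    """IC of the most informative common ancestor of two terms."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(information_content(t) for t in common)

def oss(query, disease_terms):
    """Average, over query terms, of the best match among the disease terms."""
    return sum(max(mica_ic(q, d) for d in disease_terms) for q in query) / len(query)

exact = oss({"cataract"}, DISEASES["disease B"])    # same term in query and disease
partial = oss({"cataract"}, DISEASES["disease A"])  # match via "eye abnormality"
unrelated = oss({"cataract"}, DISEASES["disease C"])  # only the root is shared
```

In this toy example the exact match scores highest, the near miss through the shared parent scores lower, and the unrelated disease scores zero, which is the graded behaviour the paragraph describes.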

Summary of the Paper

The presented paper, “Clinical Diagnostics in Human Genetics with Semantic Similarity Searches in Ontologies” by Köhler et al. (Am J Hum Genet. 2009 Oct 9; 85(4): 457–464), describes a further refinement of the method using a statistical treatment. For a given disease, if random clinical features from the HPO were selected one would expect a lower OSS score than for a patient who actually had the disease. If the OSS for random features were repeated many times, a distribution would be created and so one could then look at the real patient OSS and determine a p-value on this distribution. If the real OSS was higher than 95% of the random OSS scores, the p-value would be lower than 0.05 and indicate a likely match. Furthermore, if the same features were compared with different diseases and their random OSS distributions, a ranking of the likelihood of diseases could be determined by ranking the corresponding p-values. They call this the OSS-PV.
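
The random-sampling idea can be sketched as follows. This is an illustration under stated assumptions: the vocabulary and disease are hypothetical, the number of random draws and the seed are arbitrary, and a simple count of shared terms stands in for the OSS so that the block is self-contained.

```python
import random

def empirical_p_value(score, query, disease_terms, all_terms, n_random=1000, seed=0):
    """Fraction of random queries (same size as the real one) scoring at least as high."""
    rng = random.Random(seed)
    observed = score(query, disease_terms)
    hits = 0
    for _ in range(n_random):
        random_query = set(rng.sample(all_terms, len(query)))
        if score(random_query, disease_terms) >= observed:
            hits += 1
    return hits / n_random

# Stand-in score (count of shared terms); a real run would plug in the OSS instead.
def overlap_score(query, disease_terms):
    return len(query & disease_terms)

# Hypothetical vocabulary of 200 terms and a disease annotated with 5 of them.
all_terms = [f"term{i}" for i in range(200)]
disease_terms = {"term0", "term1", "term2", "term3", "term4"}
patient = {"term0", "term1", "term2"}

p = empirical_p_value(overlap_score, patient, disease_terms, all_terms)
```

Repeating this for every disease in the database and sorting by the resulting p-values would give the ranking described above.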

Since they considered it too onerous to enter, within the framework of the terms of the HPO, the clinical features of real patients with known diseases, they used simulated patients. This was done for 44 diseases, where they created a “patient” having a disease with a selection of the clinical features of the disease weighted by how commonly those features were found in that disease. For each disease 100 patients were created, so if from the clinical literature a feature is found in 1% of cases with the disease, 1 of the 100 simulated patients would have that feature.

They added “noise” to the process by adding to the patients some random features that were not part of the disease, and “imprecision” to the process by replacing some features with their HPO parent terms.
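
The construction of simulated patients, with optional noise and imprecision, might look something like the sketch below. All names, frequencies and terms are hypothetical. One simplification is assumed: each feature is kept with probability equal to its frequency, so a 1% feature appears in about 1 in 100 patients on average rather than in exactly 1 of 100.

```python
import random

def simulate_patient(feature_freqs, parents, all_terms, rng, n_noise=0, imprecision=False):
    """Draw one simulated patient from a disease's feature frequencies."""
    # Keep each disease feature with probability equal to its reported frequency.
    features = {f for f, freq in feature_freqs.items() if rng.random() < freq}
    if imprecision:
        # Replace each feature by its parent term where one is defined.
        features = {parents.get(f, f) for f in features}
    # Noise: add random terms that are not features of the disease.
    unrelated = [t for t in all_terms if t not in feature_freqs]
    features |= set(rng.sample(unrelated, n_noise))
    return features

# Hypothetical disease, ontology fragment and vocabulary.
feature_freqs = {"ataxia": 1.0, "optic atrophy": 0.5, "deafness": 0.01}
parents = {"ataxia": "movement abnormality", "optic atrophy": "eye abnormality"}
all_terms = ["ataxia", "optic atrophy", "deafness", "cataract", "dystonia", "seizures"]

rng = random.Random(1)
patients = [simulate_patient(feature_freqs, parents, all_terms, rng) for _ in range(100)]
noisy = simulate_patient(feature_freqs, parents, all_terms, rng, n_noise=2, imprecision=True)
```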

Then they looked at the rank position of the true disease among all the 5000 or so database diseases found by the different methods. The closer the rank position to the true position (first!), the better the method performed.

Unsurprisingly, the performance of the feature vector method, as shown by box plots of rankings for all 44 diseases tested, was found to suffer when imprecise terms were used, because that was the point of using the ontological system. The OSS-PV method more modestly outperformed the raw OSS method when noise and imprecision were added.

As the authors point out, the OSS method potentially suffers from the fact that it only matches query terms with disease terms. If a disease also had many terms that did not match the query terms, surely the overall match would be less specific. This can be taken into account by performing a symmetrical similarity search, where the OSS is the average of the matches of the query to the disease and the matches of the disease to the query. However, they did not use this method in their presented data, only stating that when they used it the symmetrical OSS-PV still significantly outperformed the feature-vector method. They do not state that it still outperforms the symmetrical raw OSS.
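
The difference between the one-way and symmetrical searches can be illustrated with a small sketch. The pairwise similarity values are hypothetical placeholders for the IC of the most informative common ancestor of each pair of terms.

```python
# Hypothetical pairwise term similarities; unlisted pairs are treated as zero.
PAIR_SIM = {
    ("ataxia", "ataxia"): 2.0,
    ("ataxia", "optic atrophy"): 0.0,
    ("ataxia", "deafness"): 0.0,
}

def sim(a, b):
    """Look up a pair in either order."""
    return PAIR_SIM.get((a, b), PAIR_SIM.get((b, a), 0.0))

def one_way(from_terms, to_terms):
    """Average, over from_terms, of each term's best similarity to any of to_terms."""
    return sum(max(sim(a, b) for b in to_terms) for a in from_terms) / len(from_terms)

def symmetric(query, disease):
    """Mean of the query-to-disease and disease-to-query scores."""
    return 0.5 * (one_way(query, disease) + one_way(disease, query))

query = ["ataxia"]
disease = ["ataxia", "optic atrophy", "deafness"]

forward = one_way(query, disease)    # the single query term matches perfectly
backward = one_way(disease, query)   # two of three disease terms find no match
both = symmetric(query, disease)
```

The one-way score ignores the two disease features that the patient lacks, whereas the symmetrical score is pulled down by them.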

Another point raised by the paper is that if one finds on a disease search that no disease fits the features with a p-value less than 0.05, exploration could be made of other clinical features, or child features of the entered clinical features that would have a higher information content and provide a more significant match. Going back and looking for a specific feature, or performing a specific investigation, would be an example of this.

Journal Discussion

As described in the introduction, any attempt to quantify and rationalise differential diagnosis should be lauded and this paper clearly describes progressive refinements of this process. It is almost negligent to have all the data available on thousands of diseases and not to use them because the unaided human mind simply cannot store so much information.

However, a number of further refinements and limitations present themselves.

First, the matching of terms is still semantic rather than systematic. While a knowledge-based approach, it nevertheless does not rely on understanding of disease pathophysiologies and pathognomonic features. Some clinical features that share a close parent may in fact best distinguish diseases rather than be considered loosely positively associated features. This may apply particularly in neurology where there is a more systematic approach. For example, upper motoneurone lesion and lower motoneurone lesion may be considered together and share a common parent in “motor neurone lesion”, but apart from the case of motoneurone disease, they split the differential diagnosis more than upper motoneurone lesion and no motor lesion at all. They are semantically similar but nosologically opposite. Horizontal supranuclear gaze palsy and vertical supranuclear gaze palsy may share a parent with high information content, but may be the feature that best separates Gaucher disease from Niemann-Pick disease.

This leads to the second point. The frequency, or sensitivity, of a clinical feature in a disease is not considered, although ironically considered when creating the simulated patients with the 44 tested diseases. In large part this reflects the lack of clinical data in the databases themselves. It is regrettable that case reports are not combined into case series which contain information on the frequencies of occurrence of clinical features, or when there are case series, these data are not actually collected systematically. If a clinical feature occurs in 1% of cases of one disease and 100% of cases of another disease, clearly the annotation of the feature for the second disease should be considered far stronger than for the first. Instead, because there are no such data, they are given equal weight; the weighting only considers whether or not the feature is also found in a number of other diseases, not how commonly it is found in those diseases.

There is no consideration of how common the disease is in the first place. While restricting themselves to rare and genetic diseases by definition, there can be a frustrating tendency for searches to throw up the least likely diagnosis. It is often the case in practice that the clinician does not know in advance that the patient has a rare genetic disease, and a diagnostic tool should be most useful to those with least intimate knowledge of the database. Thus, when entering the features dystonia, spastic hemiparesis and spastic dysarthria in a case of cerebral palsy, it comes as a surprise when the top diagnosis is cleft palate-lateral synechia syndrome.

Finally, the methods assume that clinical features are independent. In fact, many clinical features are strongly interdependent; they tend to occur together. The presence of the second feature is not really very additionally informative if the first is present. This problem would be common to most forms of differential diagnosis calculators, including those using Bayesian methods, and could only be solved if there were data on the interdependence of clinical features in different diseases; currently it is hard to find even raw frequency data for most diseases.

The point that the authors raise about using their App to find features that would be more specific in making a diagnosis is an interesting one, and opens a new approach to diagnosis and refinement of the process of often expensive and sometimes risk-associated investigation. One could imagine the improvements in medical care that would arise from use of an App that gave a differential diagnosis based on initial clinical information and then showed the relative power of different investigations in narrowing that differential.

A further use of these methods would be in creating diagnostic criteria. While clinical practice is rightly focused on the most likely diagnosis in a patient, clinical research is focused on a group of patients where the diagnosis is certain, i.e. specificity at the expense of sensitivity. Currently, diagnostic criteria seem to be set largely by “workshops” – gatherings of the great and the good usually in an exotic location who draw up a list of features, create two categories of importance and then decide how many features are required for a “definite diagnosis”. Using a quantified method such as that described in this paper for every study patient and including only patients where the diagnosis reaches a threshold p-value score would seem to be a far more reliable method.

The paper on which this journal club article is based was presented by Dr John McAuley, Consultant and Honorary Senior Lecturer in Neurology at Queens Hospital, Romford.



Coronavirus disease is obviously not a neurological condition, apart from an isolated case report of associated encephalitis, which is to be expected very rarely with viral infections. But because it is so topical, the paper “Clinical Characteristics of Coronavirus Disease 2019 in China”, published in haste in the New England Journal of Medicine on 3rd March, 2020, was nevertheless presented.


A novel enveloped RNA virus of coronavirus type, similar to SARS coronavirus, was first identified as causing viral pneumonia in early December 2019, and the disease was named Covid-19. It is believed to have first been transmitted through livestock in a large market in Wuhan, Hubei province. It is thought in general that such viruses are endemic in wildlife, such as in bats, and mutate to become transmissible to other animals and to humans.

As of Friday 6th March, there were 100,645 confirmed cases worldwide, and 3411 deaths linked to the virus. There were 55,753 cases who had recovered. In Hubei province, for the first day since the outbreak no new cases had been reported.

The details are unclear, but the fact that the UK government is said to be moving from a containment to a delay phase suggests that at least some UK cases have been identified that appear to have had no contact with potential sufferers in China, Iran, Italy or other hotspots, nor with other UK individuals known to have the disease.

Journal Club Article

The paper discussed is an early report focusing on numbers affected, initial outcomes and clinical presentation. It was approved by the Chinese authorities.

Data were sourced from records of laboratory-confirmed cases, using assay of nasal and pharyngeal swabs, between 11th December 2019 and 29th January 2020. Certain hospitals were sampled, so by no means were data collected from all cases. In all, 14.2% of all known hospitalised cases were included in the study. It is not clear how widespread the screening of the population by these laboratory tests was; all the patients in this study were hospitalised.

26% of these cases had not had contact with Wuhan residents, indicating widespread serial human to human transmission.

Clinical information is as follows:

  • Incubation period (presumably from ascertaining likely time of exposure) was median 4 days (interquartile range 2 to 7 days).
  • Fever in only 44% on admission, but developed later.
  • Cough in 68%.
  • Viraemic symptoms occurred in some patients, but upper respiratory tract symptoms, lymphadenopathy and rash were very rare.
  • CT chest abnormalities were very common (86%) in both mild and severe cases.
  • Lymphopaenia was common (83%).
  • Only 1% of cases were under 15 years old.

Of these hospitalised cases, 926 were considered mild and 173 severe. The main factors predicting this were advanced age and comorbid disease (especially coronary heart disease, diabetes, COPD and hypertension), also breathlessness at 38% versus 15% (unsurprisingly as this would be a criterion for severity). Similarly, inflammatory markers and markers of multi-organ involvement were associated with more severe disease. The main complicating feature of severe cases was acute respiratory distress syndrome, occurring in 16%.

The outcomes were 25% risk in severe cases of intensive care admission, mechanical ventilation or death (8%). Only 0.1% of cases categorised as non-severe died. The overall death rate was 1.4%. The national statistics at the time had a death rate of 3.2%.

By the data cut-off point, 95% of mild cases and 89% of severe cases were still hospitalised; the median lengths of hospital stay were 11 and 13 days respectively. Perhaps mild cases were hospitalised for purposes of isolation.

Journal Club Discussion

The paper reports likely ascertainment bias from milder cases not being tested. Nevertheless, the scale of the morbidity and mortality of the disease is not underestimated. Ascertainment bias becomes more relevant if one expects a pandemic and most of the population to become exposed. By these means the population risk can be inferred.

The paper also reports the fact that many patients were still in hospital, and perhaps very unwell, by the study’s end point. In the study, the number of cases requiring intensive care treatment is three times the death rate. Perhaps the death rate of already infected cases may climb. On the other hand, ARDS, the major serious complication of coronavirus infection, has a mortality of around 40%, and since 16% had this condition and 8% died, perhaps few more would be expected to die.

There does appear to be an opportunity for more information to be gleaned from these data or similar studies. The large number of cases could be randomised to have treatments not clear to be effective, such as oseltamivir, steroids and intravenous immunoglobulin. Less than half of cases had these treatments, but nevertheless appreciable numbers. It would have been helpful to know the death rates for patients who did or did not have these treatments rather than only the end point rates, as in reality some of these treatments might be most relevant when patients have already reached the ITU admission end point.

A follow up study would give better indicators of important epidemiological issues such as ultimate death rates and morbidity, the possibility of reinfection versus lasting immunity and any signs that more recently infected cases, where transmission has been via several human hosts, have any milder disease than those directly exposed to the transmitting animals.

A population based study that tested all individuals in high risk areas would determine the likely proportion of individuals who have been infected but not become very symptomatic.

Worldwide, we would also want to know how ambient temperature and sunlight levels affect transmissibility.

One suspects that epidemiologists in charge of advising governments have more information than is released to the public, and various advanced tools to model infection spread, but from the recent explosion of cases in Italy and now elsewhere, where talk is of delay rather than containment, there is little confidence that the slowing up of cases in China is going to be replicated worldwide.

From the death rates reported in Italy, there appears to be no clear evidence that the disease is becoming milder, but from the delay of many days from exposure to developing critical illness, perhaps it is too early to tell.

The lack of cases in hot or southern hemisphere countries would suggest a seasonal effect of the virus, and some reassurance to northern hemisphere countries approaching Spring. But in Australia there were already 40 cases confirmed by 4th March and at least three cases had had no recent foreign travel and no traceable contact.

It seems that one scenario for the UK is that the infection eventually replicates that of Hubei province, which has a similar population to the UK and had around 11,000 cases with few new cases to come, and with around a 1-3% mortality rate, mainly in the elderly and infirm for whom ‘flu’ is also a significant source of mortality. With around 20% of cases classed as severe, this would require an extra 2000 of some form of high dependency inpatient beds for several days and spread over only a month or two.

However, we do not have an explanation for the slowing of new infection rates in China. It could be that most of the local population has already been exposed and most were resistant to severe symptoms, or it could be that containment measures have been very effective. If the latter is the explanation and is in reality only delaying inevitable spread through the population, or if containment is not replicated to the same degree in Western countries and if there is no seasonal dip in transmission, one could imagine hundreds of thousands of cases in the UK spread over the next year. And with a current mortality rate seemingly up to 3% this is unlikely to drop when there are insufficient hospital resources to manage such numbers.

The paper on which this journal club article is based was presented by Dr Bina Patel, Specialist Registrar in Neurology at Queens Hospital, Romford.


Anticonvulsant Medications for Status Epilepticus

Status epilepticus is a medical emergency with significant morbidity and mortality and, in circumstances where benzodiazepines alone have failed to terminate seizures, has traditionally been treated with anticonvulsants such as phenytoin or phenobarbitone. Other intravenously administered antiepileptics have also been found to be effective.

There is a lack of comparative data on different anticonvulsants, and this blinded prospective study, “Randomised Trial of Three Anticonvulsant Medications for Status Epilepticus” by Kapur et al. (2019), compares three options: fosphenytoin (a prodrug of phenytoin which is more expensive but more soluble, can be given intravenously faster with fewer extravasation problems and can also be given intramuscularly), valproate and levetiracetam.

Study Details

Patients in the study had to be over 2 years of age, and had to have convulsive status (persistent or recurrent convulsions) for at least 5 minutes, and then more convulsions between 5-30 minutes after an adequate dose of benzodiazepine (5 minutes to have allowed the benzodiazepines to work and less than 30 minutes, after which point another dose of benzodiazepines could have been tried instead). Patients were randomised by stratifying for age.

Patients with major trauma or anoxia, etc., were excluded, as were pregnant women (give levetiracetam and consider magnesium).

The doses of the intravenous anticonvulsants levetiracetam (60 mg/kg) and valproate (40 mg/kg) seemed very high.

The primary successful outcome was absence of clinical seizure activity and improved responsiveness at 60 min after infusion start.

Analysis was based on assuming equal prior probability of success for the three treatments, then using the binomial probability of positive or negative outcome to calculate the posterior probabilities. An iterative method was then used from these three separate probabilities to calculate the probability that a given treatment was better than the other two, or worse than the other two.
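
The general shape of such an analysis can be sketched as below. This is not the trial's own method: it assumes a uniform Beta(1, 1) prior on each response rate (one way of expressing equal prior probability), uses Monte Carlo sampling from the resulting Beta posteriors in place of whatever iterative calculation the authors used, and the patient counts are hypothetical rather than taken from the trial.

```python
import random

def prob_best_and_worst(successes, totals, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(best) and P(worst) for each treatment.

    Assumes a uniform Beta(1, 1) prior on each response rate, so the posterior
    after s successes in n patients is Beta(1 + s, 1 + n - s).
    """
    rng = random.Random(seed)
    k = len(successes)
    best = [0] * k
    worst = [0] * k
    for _ in range(n_draws):
        # One plausible response rate per treatment, drawn from its posterior.
        draws = [rng.betavariate(1 + s, 1 + n - s) for s, n in zip(successes, totals)]
        best[draws.index(max(draws))] += 1
        worst[draws.index(min(draws))] += 1
    return [b / n_draws for b in best], [w / n_draws for w in worst]

# Hypothetical counts, not the trial's data: three arms of 100 patients each.
p_best, p_worst = prob_best_and_worst([47, 47, 47], [100, 100, 100])
```

With identical response counts in each arm, each treatment comes out with roughly a one in three chance of being best and of being worst; unequal counts shift these probabilities accordingly.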

The sample size was set on the basis of correctly identifying with 90% probability a difference when one treatment was 15% better than the other two (65% response for the best and 50% response for the other two).

A total of 400 patients were enrolled. The intention to treat population was only 384 because some patients were enrolled more than once. Nearly a third of patients were then excluded because treatment did not follow the protocol, e.g. not status epilepticus such as functional seizures, did not receive the correct amount of benzodiazepine or anticonvulsant or wrong timing with respect to benzodiazepine.

Half the patients were unblinded to avoid suboptimal management.

In the per-protocol population, 47% of patients responded to each of the three treatments, with the probability of being the most effective treatment distributed as follows: levetiracetam (0.34), fosphenytoin (0.35), valproate (0.31). There was also an “adjudicated population” outcome, which was perhaps based on an adjudicator clinician looking retrospectively at the notes, whether or not the protocol had been followed or there had been previous treatment, and deciding if the treatment worked. Although the data were similar, it did seem that levetiracetam may have been worse (a probability of being the least effective treatment of 0.51 versus 0.29 and 0.2), and clearly 0.51 is 31 percentage points worse than 0.2 (valproate), which is more than their threshold of meaningful difference of 15% for best treatment.

Secondary outcomes included requirement for admission to ICU (87% for levetiracetam and only 71% for valproate).

Regarding safety, there were 4.7% deaths in the levetiracetam group and 1.6% in the valproate group, with fosphenytoin in the middle. Hypotension to a life-threatening degree, a known issue with phenytoin, occurred in 3.2% of the fosphenytoin group and only 0.7% for levetiracetam and 1.6% for valproate. Cardiac arrhythmia only occurred in one patient. Acute respiratory depression occurred in 12.8% with fosphenytoin and 8% with levetiracetam and valproate. None of these differences reached significance.

The conclusion was that there was no difference between the drugs.

Journal Club Discussion

The study was welcome as it was on an important practical topic. The group wondered about the high doses used, and whether our own guidelines should reflect these doses. The trial was powered for the primary efficacy outcome and then stopped. However, it was always going to be as likely that any differences between the drugs would lie in their side effects as in their efficacy, and it is a shame that the powering did not reflect this, so that what may have been real differences in respiratory depression or hypotension never reached significance.

The vagaries of statistics are illustrated by the per-protocol efficacies, which seem identical, and the adjudicator population efficacies, where there was actually a 31% greater chance of levetiracetam being the worst drug compared to valproate.

Negative study results always make us turn to how the study was powered: were there no differences seen because there are no differences, or because too few patients were studied (i.e. too low power)? When powering a study, a judgement must always be made on what level of difference would be considered meaningful; otherwise, if any difference were accepted as meaningful, it would require an infinite population to prove there is no difference. They chose a meaningful 15% difference for one drug being better than the other two, but if they had chosen one drug worse than the other two, the 31% difference in the adjudicator population would have been more than their set level. There should have been more explanation of their adjudicator population, and perhaps more explanation of the advantage of using Bayesian probabilities in addition to a simple comparison of means and standard errors of success rates.

In real practice, there should perhaps be tailoring of treatment to the patient. If a patient is already on therapeutic levels of phenytoin, is more of the same going to be the best choice? If a patient is a female of childbearing potential, is valproate the best choice when the patient often ends up on the oral equivalent of the status treatment they received? On reviewing the data in this study and knowing that the levetiracetam dose was very high, valproate might shade the other two choices, especially in men.

The Journal Club on which this article is based was presented by Dr Katie Yoganathan, SpR in Neurology at Queens Hospital, Romford.

Posted in Epilepsy, Intensive Care Neurology

Galcanezumab in Chronic Migraine

Migraine is one of the most common neurological conditions, and chronic migraine is a condition that, while less common than episodic migraine, is nevertheless a major cause of loss of quality of life in otherwise well individuals.

Once analgesia overuse headache has been effectively treated, and tension type headache excluded, chronic migraine is treated with migraine preventative medications, often very effectively. However, there is a proportion of patients who remain resistant to single or combination preventative treatments.

A novel target for migraine treatment is the calcitonin gene related peptide (CGRP) receptor on the smooth muscle of blood vessels in the head. CGRP is released from trigeminal ganglion efferents to the blood vessels to cause potent vasodilation as part of the trigeminovascular response (analogous to the “triple response” of pain, redness and swelling of skin inflammation). Blocking the receptor may therefore block this response. Monoclonal antibodies raised against the receptor, or against CGRP itself, have been explored as migraine treatments.

This study describes a double blind trial of galcanezumab, one such monoclonal antibody targeting CGRP. The paper does not discuss the relative hypothetical or actual benefits versus other monoclonal antibody migraine therapies already marketed or in development.

Study Design

Around 270 patients were given each of two doses of galcanezumab by monthly subcutaneous injection, and 560 were given normal saline placebo. To be enrolled on the study, patients had to have 15+ headache days per month, at least 8 of which had to be migraine days. They needed at least 1 headache free day per month. If a patient had failed >3 other preventatives, they were excluded. Patients had to stop all their existing migraine preventatives except propranolol or topiramate at least 30 days before the start of the study.

Migraine days were defined as >30 minutes of migraine or probable migraine according to ICHD-3 beta criteria (even though the duration criterion of the latter is 4+ hours). If a patient thought it was a migraine and it did not satisfy the criteria but responded to a triptan, that also counted as a migraine day.

Over 90% of patients completed the study. Only 15% of patients were on topiramate or propranolol (not specified if this was the same proportion in the three treatment groups).

The primary outcome measure was migraine days per month. At the start of treatment, this was around 19 days. Placebo reduced this by 2.7 days per month, low dose galcanezumab by 4.8 days and high dose by 4.6 days. Therefore, compared to placebo, the drug on average reduced migraine by 2 days per month. There were only about 2 extra non migraine headache days per month on average.
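The placebo-subtracted effect quoted above follows directly from the three reductions. A minimal sketch of the arithmetic, using only the figures given in this post (the variable and function names are illustrative):

```python
# Mean reduction in monthly migraine days, as quoted in this post.
reduction = {"placebo": 2.7, "low dose": 4.8, "high dose": 4.6}
baseline_days = 19  # approximate migraine days per month at the start


def excess_over_placebo(arm):
    """Reduction attributable to the drug rather than to placebo."""
    return round(reduction[arm] - reduction["placebo"], 1)


for arm in ("low dose", "high dose"):
    extra = excess_over_placebo(arm)
    print(f"{arm}: {extra} fewer migraine days per month than placebo "
          f"({extra / baseline_days:.0%} of baseline)")
```

Expressed against a baseline of about 19 days, the drug-specific effect is of the order of a tenth of the patient's monthly migraine burden.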

There were many secondary measures. Of note, 4.5% of placebo patients had a 75% reduction in migraine days, compared with 7% of low dose and 8.8% of high dose patients, while 0.5% of placebo patients had a 100% response, compared with 0.7% of low dose and 1.3% of high dose patients (not significantly different).

There was no overall quality of life measure, but there was a migraine related quality of life measure that showed significantly more improvement, about 25% more improvement than placebo. There was a patient global disease severity 7 point scale, on which there was a 0.6 point improvement on placebo, 0.8 on low dose and 0.9 on high dose, only the latter reaching significance.

The side effect profiles were similar between placebo and drug, notably common in both groups! However, there were no concerning side effects, nor indeed any characteristic enough to tend to unblind the patients or investigators.


The Journal Club thought it was strange that the study would exclude the very patients in whom the drug would mainly be used, namely those who had failed >3 conventional treatments. The focus was clearly on maximising benefit as measured by the study. By the same token, patients had to stop any preventatives before the study, even if they were partially beneficial, apart from topiramate and propranolol.

It was furthermore strange that only 15% of the recruited patients were on the two most common treatments for chronic migraine. Had they only been tried on the others, or had they had side effects? In real practice, there are usually at least some marginal benefits from preventatives and patients often remain on them.

It is therefore possible that many patients were treatment naïve as far as preventatives were concerned. This makes the 2 fewer migraine days per month vs placebo (from an initial 19 days per month) an all the more modest magnitude of benefit.

It is difficult to reconcile the cost of the drug with the fact that patients on average will still have 15 migraine days a month. Most patients would not consider this a treatment success, and certainly not such that a patient would happily be discharged from specialist care. In terms of patients having a 75%+ reduction in migraine days, generally the minimum level of meaningful benefit in a pain study, the excess over placebo was only 3-4% of patients.
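The two figures in the paragraph above can be reproduced from the results already quoted in this post. This is only a back-of-envelope sketch; the names are illustrative and no new data are introduced.

```python
baseline_days = 19  # approximate migraine days per month before treatment
drug_reduction = {"low dose": 4.8, "high dose": 4.6}

# Percentage of patients with at least a 75% reduction in migraine days.
responders_75 = {"placebo": 4.5, "low dose": 7.0, "high dose": 8.8}


def remaining_days(arm):
    """Mean migraine days per month still experienced on treatment."""
    return round(baseline_days - drug_reduction[arm], 1)


def responder_excess(arm):
    """Extra percentage of 75% responders compared with placebo."""
    return round(responders_75[arm] - responders_75["placebo"], 1)


for arm in ("low dose", "high dose"):
    print(arm, remaining_days(arm), responder_excess(arm))
```

The residual burden comes out at a little over 14 days a month, and the excess of 75% responders at roughly 2.5 and 4.3 percentage points for the two doses, which is the order of magnitude referred to above.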

The lack of a general quality of life measure means that cost benefit analysis cannot be performed. The quality of life measure used was specific for migraine and likely to show much larger differences; a cured migraine sufferer might have a near 0% to 100% swing on this scale, but another individual considering the range from death to total disability to perfect health might assign curing migraine only a swing from 90% to 100%.

A major aspect of migraine care is what happens when treatment is stopped. Patients do not want lifelong medication, let alone lifelong monthly injections. Fortunately we find that after six months of treatment, traditional preventatives can often be withdrawn. Although the study mentioned that there was an open label period and then a wash out period, we do not know any of these results; presumably they are to be held back for another publication. Is there rebound migraine on treatment withdrawal? Any funding body would want to know if the patients would likely need the treatment for 3-6 months or for many years.

As a final point, it was queried whether the definition of migraine is sufficiently specific; perhaps this limits the observed benefit in this and similar studies. Some headaches recorded as migraine may be tension type headache and therefore not responsive to specific anti-migraine treatment. The table below shows the relevant criteria.

ICHD-3 Headache Diagnostic Criteria

Criterion | Probable Migraine | Probable Tension Type Headache | Definite Tension Type Headache
Number of the following required | 2+ of: | 2+ of: | All of:
Duration | 4-72 hours | 30 min to 7 days | 30 min to 7 days
Pain characteristics (2+ of) | Unilateral; pulsating; moderate+ severity; avoid routine physical activity | Bilateral; pressing or tightening; moderate- severity; not aggravated by routine activity | Bilateral; pressing or tightening; moderate- severity; not aggravated by routine activity
Associated symptoms | Nausea, or photo plus phonophobia | No nausea; not both phono and photophobia | No nausea; not both phono and photophobia


A headache is diagnosed as a migraine if it fits probable migraine and is not a better fit with another headache diagnosis, which presumably means definite rather than probable tension type headache. The severities and durations overlap so they cannot distinguish. One of photophobia or phonophobia overlaps. So a unilateral, pressing headache with avoidance of routine activity, with no nausea, no photophobia and no phonophobia, is classified as migraine as long as it lasts 4 hours, but it seemed that some of the migraine days were half an hour of headache. Also, a headache not satisfying these criteria is a migraine if there is a response to triptans, but we have seen the large placebo response already from the main data. In general practice a tension type headache might be unilateral, and might interfere with routine activity if at the more severe end of the scale; certainly a neck ache or jaw (including temporalis muscle) ache from which a tension headache may arise may have these features.

The paper on which this Journal Club article is based was presented by Dr Piriyankan Ananthavarathan, Specialist Registrar in Neurology at Barking, Havering and Redbridge University Hospitals Trust.

Posted in Migraine

Disease Modifying Therapies in Multiple Sclerosis: Background for General Readers

Multiple sclerosis (MS) is a presumed autoimmune condition of demyelination and often inflammation of the central nervous system. Its evolution is very variable; some patients suffer episodes lasting weeks to months with complete or near complete recovery in between, and the periods between episodes may span months to decades (relapsing remitting MS). Other patients accumulate progressive disability as a result of or between episodes (secondary progressive MS). Still other patients, around 10% in total, do not suffer episodes but instead undergo a gradually progressive course with variable rapidity, but usually noticeable over the course of months to years (primary progressive MS). Patients with MS can evolve from one category to another; some in fact at a certain point remain clinically stable indefinitely.

For many decades, its immune basis has prompted trials of various immunomodulatory agents to try to reverse or at least arrest the progression of multiple sclerosis. Some have been shown not to work, e.g. corticosteroids, immunoglobulin. Some work but have largely been overtaken by newer, more expensive, therapies. For example, azathioprine is a traditional commonly used immunosuppressant and in a Cochrane review was found to reduce relapses by around 20% each year for three years of therapy, and to reduce disease progression in secondary progressive disease by 44% (though with a wide confidence interval of 7-64%). There were the expected side effects but no increased risk of malignancy. However, it remains possible that there could be a cumulative risk of malignancy for treatment durations above ten years. In the 1990s, beta-interferon became widely used but was never compared directly with azathioprine. With the 21st century came the introduction of “biological therapies”, typically monoclonal antibodies against specific immune system antigen targets. There has also been a reintroduction of non-biological therapies originally used to treat haematological malignancy or to prevent organ transplant rejection.

These new therapies, called disease modifying therapies, as opposed to symptomatic treatments or short courses of steroids for relapses, are now conceptually, though not biochemically or mechanistically, divided into two groups: those better tolerated or with fewer risks of causing malignancy or infections but less effective, and those with more risk of cancer and serious infection, including reactivation of the JC virus to cause fatal progressive multifocal leukoencephalopathy, but with greater efficacy.

The former group includes beta-interferons, glatiramer acetate and fingolimod. Fingolimod is an agent derived, like ciclosporin, from fungal toxins that parasitise insects and has the convenience of oral administration, but is now not routinely recommended because of severe relapses on withdrawal, and cardiac and infection risks. The latter group includes the biological agents natalizumab (which targets a cell adhesion molecule on lymphocytes), rituximab and ocrelizumab (which target CD20 to deplete B-cells) and alemtuzumab (which targets CD52 expressed on more mature B and T cells), and the oral non-biological anti-tumour agent cladribine, a nucleoside analogue that is activated by deoxycytidine kinase and thus interferes with DNA synthesis. Another non-biological oral agent, dimethyl fumarate, acts as an immunomodulatory rather than immunosuppressive agent and sits somewhere between the two groups, having oral administration convenience and better efficacy than the first group, but also possessing the increased PML and Fanconi renal syndrome risk of the second group.

Recent studies indicate that higher strength DMTs may slow disability progression in secondary progressive MS, as well as reduce the number of relapses. There have also been trials in primary progressive MS but these, most notably using rituximab, were not clearly positive. For a study looking at ocrelizumab on primary progressive MS, see the accompanying Journal Club review.


Cost of Disease Modifying Therapies

The disease modifying therapies are extremely expensive and, given MS is unfortunately not a rare disease, have a significant impact upon the health economy.

For example, in relation to the accompanying paper review of ocrelizumab for primary progressive MS, this drug is not really expensive compared to similar medications, having a list price of £4790 per 300 mg vial, with four infusions a year. There are many further costs associated with imaging, screening, monitoring and admission for infusions.

Normally, cost effectiveness is justified at around £35,000 per Quality Adjusted Life Year (QALY). This means the cost would be justified at £35,000 a year if each year it gave 100% quality of life to patients who would otherwise die or have zero quality of life. Clearly ocrelizumab does not do that; it appears to preserve at least 0.5 or 1 out of 10 on a disability scale in 6% of patients on an ongoing basis, giving a quality of life per patient benefit of very roughly 0.6% and a cost per QALY estimate of over £3 million. Of course, there are other considerations such as wider health economy costs of disability, the fact that some patients might have been prevented from deteriorating by more than 1 point on the EDSS, and the potential costs of monitoring for and treating cancer and PML complications in a relatively young patient population even after treatment is stopped. Note that there was actually no significant difference in this study in the SF 36, with both groups remaining surprisingly little changed after about 2 years, which probably fits with the 0.6% mean improvement figure calculated above.
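The rough estimate above can be retraced step by step. This is a sketch under stated assumptions, not a formal health economic analysis: it counts drug list price only, assumes one 300 mg vial per infusion, and treats 1 point out of 10 on the disability scale as a 10% gain in quality of life.

```python
vial_price_gbp = 4790            # list price per 300 mg vial, as quoted above
vials_per_year = 4               # assumption: one vial per infusion, four infusions a year
proportion_benefiting = 0.06     # patients spared a 0.5 to 1 point EDSS worsening
quality_gain_if_benefit = 0.10   # assumption: 1 point out of 10 counted as 10% quality of life

annual_drug_cost = vial_price_gbp * vials_per_year
mean_quality_gain = proportion_benefiting * quality_gain_if_benefit
cost_per_qaly = annual_drug_cost / mean_quality_gain

print(f"Annual drug cost: £{annual_drug_cost:,}")
print(f"Mean quality of life gain per patient: {mean_quality_gain:.1%}")
print(f"Cost per QALY: £{cost_per_qaly:,.0f}")
```

On these assumptions the drug cost alone is about £19,000 a year for a mean gain of 0.6%, which is where a figure of over £3 million per QALY comes from; imaging, monitoring and infusion costs would raise it further, while avoided disability costs would lower it.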

If the NHS, or the health economies of other countries, do not consider a tighter subset of primary progressive patients who might respond better, it is difficult to balance this with other medical, or indeed social care, conditions that require resourcing.

Posted in Inflammatory/ Auto-Immune Diseases, Primer Posts for General Readers

Ocrelizumab versus Placebo in Primary Progressive Multiple Sclerosis

Recent studies indicate that higher strength disease modifying therapies (DMTs) may slow disability progression in secondary progressive multiple sclerosis (MS), as well as reduce the number of relapses. There have also been trials in primary progressive MS but these, most notably using rituximab, were not clearly positive. For a more general review, please see the post Disease modifying therapies in multiple sclerosis.

The study being reviewed in this post, by Montalban et al., 2019, is on rituximab’s sister compound, ocrelizumab, and targets younger patients with more active disease, which seemed to be a subgroup that might have responded to rituximab.

Study Design

There were 732 patients randomly assigned to ocrelizumab or placebo in a 2:1 ratio. Inclusion criteria were a diagnosis of primary progressive MS according to established criteria and age 18 to 55 years. Their disability had to range from moderate disability but still no walking impairment to impaired walking but able to walk 20m, perhaps with crutches (EDSS 3.0 to 6.5). The disease duration had to be within 10-15 years. They should never have had any relapses.

Pairs of ocrelizumab or placebo infusions were given every 24 weeks for at least five courses. The main end point was the % of patients with disability progression, defined as at least 1 point on the EDSS scale sustained for 12 weeks, or 0.5 points at the more disabled end of the scale.

Only if this primary end point was reached would the study be continued to test secondary end points such as 24 week sustained disability progression, timed walk at week 120, change in volume of MRI brain lesions, and change in quality of life on the SF36 score.


Patients had a mean disease duration of around 6 years, and 3% more patients having ocrelizumab had gadolinium enhancing lesions on MRI (27% versus 24%).

39.3% of placebo patients had increased disability sustained for a period of 12 weeks, and only 32.9% of ocrelizumab patients (p=0.03, relative risk reduction 24%). This was similar when confirming sustained disability over 24 weeks.
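It can help to restate these percentages as an absolute risk reduction and a number needed to treat. The sketch below uses only the two proportions quoted above. Note that the crude ratio of proportions comes out lower than the quoted 24% relative reduction, which presumably derives from a time-to-event analysis in the paper; that is an assumption here, since the method is not described in this post.

```python
placebo_progressed = 0.393       # 12-week confirmed progression on placebo
ocrelizumab_progressed = 0.329   # 12-week confirmed progression on ocrelizumab

absolute_risk_reduction = placebo_progressed - ocrelizumab_progressed
crude_relative_reduction = absolute_risk_reduction / placebo_progressed
number_needed_to_treat = 1 / absolute_risk_reduction

print(f"Absolute risk reduction: {absolute_risk_reduction:.1%}")
print(f"Crude relative reduction: {crude_relative_reduction:.1%}")
print(f"Number needed to treat: {number_needed_to_treat:.1f}")
```

An absolute reduction of about 6 percentage points means that roughly 16 patients would need treating over the study period for one to be spared a confirmed progression.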

On the timed walk, there was a mean 39% slower performance after 120 weeks in patients on ocrelizumab and 55% slower in patients on placebo (p=0.04). There was no difference in quality of life (SF36 – physical component; a 0.7 out of 100 deterioration on ocrelizumab and 1.1 out of 100 on placebo).

There were three potentially relevant deaths in the ocrelizumab group (out of 486 patients), two from pneumonia and one from cancer, and none in the placebo group, but the overall rate of serious infections was not really different. Cancer rate was 2.3% versus 0.8%, but obviously this would have to be monitored over further decades. Even during one year of open label extension there were two further cancers in the ocrelizumab group. The overall rate of neoplasms to date is 0.4% per 100 patient years, double the baseline rate, but this reflects a short time in a large number of patients.

In summary, a modest reduction in disability was seen on ocrelizumab, namely preserving against 0.5 to 1 point loss on the EDSS scale in 6% of patients.



We focused mainly on the figure (see below) where it seems that ocrelizumab stopped about 5% of patients deteriorating in the first 12 to 24 weeks, from about 9% down to 4%, and then this difference was maintained throughout until the end of the trial where about 60% of patients still had not deteriorated. The plateau at 3-4 years is probably because of the end of the trial (see below), not a stable MS population.


The journal club were initially surprised at the apparent focus on a 12 week primary end point. Patients would have progressed from zero to 3-6 out of 10 on the EDSS scale over a mean period of 6 years, yet the study appeared to be measuring progression of 0.5 to 1 point over just three months. The surprise arose from some confusion over the phrase in the paper describing the primary end point as “percentage of patients with disability progression confirmed at 12 weeks”, and then in the results “percentage of patients with 12-week confirmed disability progression (primary end point) was 32.9% with ocrelizumab versus 39.3% with placebo.” It might seem that the primary end point was recorded at 12 weeks following treatment initiation. In fact the primary end point was recorded at the end of the study, which was stopped after more than 2 years once a predefined proportion of patients had deteriorated. It means that over 2+ years, 32.9% of patients had a deterioration that was sustained over at least 12 weeks, i.e. not a relapse.

The graph shows the numbers of patients remaining without disability progression at different times, starting at 487 and dropping to 462 at 12 weeks for ocrelizumab, a fall of 5.1% of patients, and from 244 to 232 for placebo, a fall of 4.9%. At 24 weeks, the figures were 7.6% versus 13.1%. Some of the dropouts might be due to stopping because of tolerability, but this was a small number, possibly accounting for the small numbers of drop-outs between the assessments every 12 weeks. For a 12 week confirmed disability progression, clearly there will be a lag in identifying patients whose increase in disability is sustained for 12 weeks. It seems that the time points do not add this 12 weeks because there is a first jump at 12 weeks in both groups. However, these numbers drop down to zero, not to the 60% of patients that appear not to have deteriorated. This is likely to be because patients who started the study later left the numbers at risk when the study was terminated for them before 216 weeks. Nevertheless, factors such as drop outs due to tolerability and end of study probably explain the difference between the figures in the results and the plateau levels on the graphs.
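The 12-week percentages quoted from the graph are simple proportions of the starting numbers, as the following sketch shows. The function name is illustrative, and the fall in numbers includes any drop-outs as well as patients with disability progression.

```python
def percent_lost(at_start, remaining):
    """Percentage of the starting group no longer counted as progression free."""
    return 100 * (at_start - remaining) / at_start


ocrelizumab_12_weeks = round(percent_lost(487, 462), 1)
placebo_12_weeks = round(percent_lost(244, 232), 1)

print(ocrelizumab_12_weeks, placebo_12_weeks)  # 5.1 4.9
```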

What is interesting is that the difference between ocrelizumab and placebo diverged very early on the graph, and not really further over 2 years. While the 12-week sustained disability was designed to eliminate the possibility that the study is scoring relapses in previously primary progressive disease, or some other temporary factor such as injury from a fall or intercurrent infection, there is nevertheless a suspicion that ocrelizumab was mainly working well on a small subset with more active disease. The extra 3% with gadolinium enhanced lesions – a proportional difference of about 12% – unfortunately suggests a potential issue with randomisation; this might precisely be the group who could respond better.

It is noteworthy therefore that in its most recent NICE appraisal, the criteria for considering ocrelizumab are not those in this study, but a subset of primary progressive patients with enhancing disease on MRI imaging.

The journal club article described in this post was kindly presented by Dr Bina Patel, Specialist Registrar in Neurology.

Posted in Inflammatory/ Auto-Immune Diseases

Detection of Brain Activation in Vegetative State by Standard Electroencephalography

This paper by Claassen et al., 2019 looks at EEG pattern changes in response to verbally given movement commands to see if there is a subset of vegetative state patients who are cognitively responsive and yet who have no motor response. The hope is that this might predict eventual outcome.

The study took 104 patients who had had acute brain injury. Most (85%) had non traumatic brain injury, which in general carries a more predictably bad prognosis. These patients were either in a vegetative state or in a somewhat better minimally responsive state, e.g. localising to pain but not obeying commands.

The EEG testing was performed within a few days of initial ITU referral.

In a trial, a patient was asked eight times to open and close their hand repeatedly for 10 seconds and then relax their hand for 10 seconds while recording ongoing EEG activity. Two second time blocks were analysed in the frequency domain by calculating the power spectral density (PSD), looking at the relative strength of signal in each EEG lead in four different frequency ranges (delta, theta, alpha and beta).
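To make the idea of relative power in frequency bands concrete, here is a minimal sketch for one 2-second block of a single lead. It uses a plain discrete Fourier transform rather than whatever spectral estimator the authors used, and the sampling rate and band limits are assumptions (conventional values, not taken from the paper).

```python
import cmath
import math

SAMPLE_RATE_HZ = 128  # assumed; not stated in this post
# Conventional band limits in Hz, assumed for illustration.
BANDS_HZ = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}


def band_powers(block):
    """Relative power per frequency band for one block of a single EEG lead."""
    n = len(block)
    mean = sum(block) / n
    centred = [x - mean for x in block]
    powers = dict.fromkeys(BANDS_HZ, 0.0)
    for k in range(1, n // 2):
        freq = k * SAMPLE_RATE_HZ / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centred))
        for name, (low, high) in BANDS_HZ.items():
            if low <= freq < high:
                powers[name] += abs(coeff) ** 2
    total = sum(powers.values()) or 1.0
    return {name: value / total for name, value in powers.items()}


# Synthetic 2-second block dominated by a 10 Hz (alpha range) rhythm.
block = [math.sin(2 * math.pi * 10 * t / SAMPLE_RATE_HZ)
         for t in range(2 * SAMPLE_RATE_HZ)]
relative = band_powers(block)
print(relative)
```

For the synthetic block almost all the power falls in the alpha band. In the study, a set of such values per lead and per band for each 2-second block would form the input to the classifier.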

A “machine learning algorithm” was used to distinguish the “move” PSDs from the “stop moving” PSDs.

Patients were considered to show EEG activation if the algorithm consistently showed a significantly greater than chance (p=0.5) level of ability to distinguish the “move” command from the “stop moving” command.
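One simple way to ask whether a classifier is doing better than chance is an exact one-sided binomial test on the number of blocks it labels correctly. This is only an illustrative stand-in and not the paper's procedure, which is not detailed in this post; it also assumes the blocks are independent, which consecutive EEG blocks may not be. The counts are invented.

```python
from math import comb


def p_value_above_chance(correct, total, chance=0.5):
    """Probability of at least `correct` successes out of `total` by guessing alone."""
    return sum(
        comb(total, k) * chance ** k * (1 - chance) ** (total - k)
        for k in range(correct, total + 1)
    )


# Invented example: 8 repetitions of 5 "move" and 5 "stop moving" blocks = 80 blocks.
print(p_value_above_chance(40, 80))  # half correct: no evidence of above-chance performance
print(p_value_above_chance(52, 80))  # 65% correct: unlikely to arise from guessing
```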

Outcome was determined by the standard Glasgow Outcome Scale after 12 months, with values >=4 (being able to be left up to 8 hours alone) defined as a good outcome.

Ultimately, patients who had at least one record showing EEG activation had a 44% chance of good outcome as defined above and only 14% of patients without EEG activation had a good outcome (with 5% missing data).


Some of the patients were under some sedation for safety reasons, which could influence their responsiveness in a more reversible manner unrelated to their brain injury and also affect their EEG, although this would be unlikely to affect the change in pattern of EEG over several seconds, other than through the patient’s genuine response level.

It might have been worthwhile to record surface EMG of the forearm flexors, just to confirm there was no difference in EMG activity between “EEG activation” patients and those with no EEG change. In a patient with critical illness neuromyopathy, a little movement or muscle activation might not easily be seen.

Because patients were just taken consecutively, rather than being matched according to their coma severity, there could be poor matching, and this was indeed present: the patients who were subsequently found to be “EEG responsive”, and eventually to have a better outcome, were less likely to be in the worst comatose category at initial enrolment (50% vs 55%) and more likely to be in the best category (31% versus 23%). Although the odds ratios were not statistically significant, the absence of a significant difference is not positive evidence that the groups had the same initial severity.

In fact, if one stratified patients according to the initial three clinical severity categories, would that have more powerfully predicted better outcome than “EEG responsive” or not, making the test redundant?

On technical appraisal of the methodology, it seems that the power spectral densities were individual 2-second blocks, with all the comparisons and averaging being done subsequently by the machine learning pattern recognition algorithm.

Statistically, the paper used the single value of the area under the curve (AUC) of the receiver operating characteristic (see below). This means that across a range of sensitivities (or true positives (TP), where the algorithm correctly decides that there is enough of a difference between the “move” and “stop moving” patterns), there is an opposing range of false positives (FP). How convex the curve describing this range is relates to how good the test is. A value of 1 means perfect classification, 0.5 is just random (the straight diagonal in the figure below), and 0 means the pattern change is actually reliably identifying the stop pattern when it was supposed to identify the move pattern.
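The three landmark values of the area under the curve can be reproduced with a few lines. The sketch uses the pairwise (Mann-Whitney) form of the AUC, which is equivalent to the area under the ROC curve; the classifier scores are invented for illustration and are not data from the paper.

```python
def roc_auc(scores_move, scores_stop):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    Equals the probability that a randomly chosen "move" block gets a higher
    classifier score than a randomly chosen "stop moving" block; ties count half.
    """
    wins = 0.0
    for m in scores_move:
        for s in scores_stop:
            if m > s:
                wins += 1.0
            elif m == s:
                wins += 0.5
    return wins / (len(scores_move) * len(scores_stop))


# Invented classifier scores, for illustration only.
print(roc_auc([0.9, 0.8, 0.7], [0.3, 0.2, 0.1]))  # 1.0: perfect separation
print(roc_auc([0.6, 0.4], [0.6, 0.4]))            # 0.5: indistinguishable
print(roc_auc([0.3, 0.2, 0.1], [0.9, 0.8, 0.7]))  # 0.0: reliably the wrong way round
```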

[Figure: ROC curves, from the Wikipedia article on the receiver operating characteristic]

This is shown in their fig. 3 (below), which seems to show the AUC values for each of the 5 “move” 2-second samples (hence the varying level across each peak and trough) followed by each of the 5 “stop moving” samples, with the whole thing repeated 8 times. However, they say that the graph is shown “for descriptive purposes only” so we do not know how it relates to the real data! We do not know if these are actual averages for all the controls, all the EEG responsive patients (which they call cognitive motor dissociation (CMD)) and all the non EEG responsive patients. If they are averages, they would have to be across all the first 2-second epochs and then all the second 2-second epochs, etc.

[Figure: fig. 3 from Claassen et al.]

Where this is important is that although the algorithm provides a discrete yes-no answer, the confidence of this answer is a continuous variable, and there is a suspicion that this confidence level may fall into a continuous range with healthy volunteers at one end and the most unresponsive EEG patient at the other, rather than there being three discrete modal peaks of normal, EEG responsive and EEG unresponsive. If the former, the inevitable variability about a single mode makes the test far less useful as a predictor of outcome in individual patients. At best, it could be an independent predictor that, combined with other predictors, could build up a reasonably confident prognosis.

A major issue with patients in a vegetative state is when to withdraw support. In the UK, in patients with non traumatic acute brain injury, persistent vegetative state is defined as such around 3 months after injury and this is the time when conversations may be had along these lines on the basis that if the patient has not “woken” by this time, the chance they may eventually do so, with a reasonable quality of life, becomes remotely slim. No-one is ever going to think about withdrawing support at 6 days post-injury on the basis of an “EEG unresponsive” result.

This Journal Club post was presented by Dr Rubika Balendra, Specialist Registrar in Neurology at Barking Havering and Redbridge University Hospitals NHS Trust.


Posted in Intensive Care Neurology

Double-Blind Double-Dummy Randomised Study of Continuous Intrajejunal infusion of Levodopa-Carbidopa Intestinal Gel in Advanced Parkinson’s Disease

Background

Levodopa, a pro-drug of dopamine, has been used successfully to treat symptoms of Parkinson’s disease for fifty years and remains the mainstay of medical management. However, after years of treatment, with increasing loss of dopaminergic presynaptic terminals, symptomatic control may become more brittle, with sudden and unpredictable “on” and “off” treatment times during the day, or with involuntary movements called dyskinesia. There are theoretical reasons, and some animal model and clinical evidence, why intermittent oral delivery of levodopa may increase susceptibility to these problems through unphysiologically wide fluctuations in synaptic dopamine; unfortunately the plasma half life of levodopa after an oral dose is as little as an hour. As a result, other long acting medicines have been introduced, but they may come with other side effects and are simply not as powerful as levodopa.

Relatively steady state levels of levodopa can be achieved by direct intrajejunal delivery. Unfortunately, levodopa is not stable in solution and the gel used to keep levodopa in suspension in a form that can be delivered is very expensive to produce. A year’s treatment in the UK was estimated by NHS England in 2015 to cost around £28,000. As a result, despite there now being substantial evidence of the treatment’s effectiveness, there has been a debate about the treatment’s cost effectiveness. Calculations of the cost effectiveness in terms of cost per quality adjusted life year (QALY) gained vary considerably. The calculations depend not only on the cost of treatment versus standard treatment and the difference in quality of life, but also on the carer costs and other costs. So if a treatment is less effective, the patient may be more disabled and cost more. It is unclear, however, how figures on cost of disability can be applied to an estimate of how much less effective the treatment is at all points of the severity scale. As far as I am aware there is no actual study showing how much is saved in non medication costs in patients on levodopa-carbidopa intestinal gel (LCIG); the information is instead extrapolated.

In one sense, the QALY gain might be counted twice; once for the intrinsic value of the gain in quality of life, and again for the reduction in disability that resulted in the improved quality of life. In another, this might be a fair way to handle such analysis compared to a treatment that improved quality of life without reducing disability cost.

It is important in such calculations to use reliable data on the magnitude of benefit gained, rather than just to show that there is a gain. This is likely to be achieved by a randomised controlled study with a control arm and is exemplified by the study of Olanow et al., the subject of this journal club.

Study Design

Sixty six of sixty eight candidate patients underwent the trial. Patients were selected on the basis of having idiopathic Parkinson’s disease (IPD) for five or more years, having optimised therapy (meaning a trial of levodopa, a dopamine agonist and one other type of anti-parkinsonian therapy), at least three hours of “off” time daily, and no clinically significant psychiatric abnormalities.

At first, we assumed that the trial was a cross-over design; in fact it was not. Patients all had jejunostomy procedures but were randomised to LCIG plus placebo oral levodopa, or placebo LCIG plus oral levodopa. They were assessed after a four week stabilisation period before intervention, and then 12 weeks afterwards. Then the two groups were compared.

Patients who were on CR preparations or COMT inhibitors were switched to equivalent immediate release preparations. The LCIG dose was the same as the total daily levodopa dose, delivered over 16 hours of the waking day in the normal fashion for jejunal delivery.

Study Findings

On looking at the graph, labelled figure 2B in the MS, it is immediately obvious that both LCIG patients and oral patients improved very dramatically and then levelled off, despite previously being “optimised” on oral therapy. Our possible suspicions about what “optimised” means are confirmed. As explained by the authors, the doctors had the opportunity to increase the LCIG or oral levodopa during the study, and this was done in a number of cases after the 4 week stabilisation period. In fact the oral medication patients had their medication dose increased more (a mean of 250 mg daily versus 100 mg daily). Despite this, neither group had increased “on” time with troublesome dyskinesia.

The main message of the study is that after the 12 weeks, the improvement was greater with LCIG, with a mean of around 1.9 hours less “off” time and 1.8 hours more “on” time without troublesome dyskinesia. I suppose if there is no change in “on” time with dyskinesia, it is obvious that the two values will be similar as one state is replaced by the other.

Regarding quality of life, there was an 11 point versus 4 point improvement in PDQ-39 (a PD quality of life measure). This seems quite important.

Strangely, on the UPDRS there was an improvement in part II (activities of daily living) on LCIG and a worsening on oral, but actually twice as much improvement in part III (motor examination measured in the on state) on oral therapy. Possibly this means that there is a subtle side effect of oral therapy, increased during the trial, that adversely affects wellbeing, but the increased “hit” of levodopa made their best on state better than with LCIG.


It is not clear how far the withdrawal of COMT inhibitors left patients in either treatment arm suboptimally treated and therefore needing increased treatment during the study. It would be important to ascertain if by chance the oral arm had had more COMT inhibitors withdrawn.

The main advantage of this study is that having the control arm at least allows us to appreciate that optimised does not really mean optimised. The patients were clearly underdosed; one has to wonder how much better the oral patients could have been had there been the opportunity to optimise them properly by adjusting top up dopamine agonists, adjusting the frequency rather than just the dose quantities and by introducing, reintroducing or optimising COMT inhibition. After all, studies on COMT inhibitors show a reduction in “off” time of about an hour compared to baseline “optimised” therapy.

A parsimonious interpretation of the data is that LCIG simply has better bioavailability than oral; the patients were underdosed and switching to LCIG is simply stronger and could be replicated by giving more oral treatment. In fact this may well have been the case, explaining the 150 mg more levodopa per day given to oral patients, but the facility for being able to change doses meant its effect would be minimised in this study.

While the power of the study was easily enough to demonstrate a clinically meaningful difference, I wonder if a cross-over design might have allowed intra-patient comparisons and a clearer effect, and eliminated or elucidated the improvement effect from oral therapy. In this design, each patient would have placebo LCIG for half the time, and placebo oral for the other half. The direction of change at the cross over point would be the key parameter. The patients’ doses would be matched at this cross over point, and then not changed over the second half. This design would be confounded by a bioavailability effect, but that effect could at least be measured by the increase in oral dosing during the first half, and there might be an overdose effect of switching from oral to LCIG during the second half of the trial.

Studies looking at the cost effectiveness of LCIG should primarily take data from those like this one, rather than those that use an open label design showing an improvement compared to baseline “optimised” therapy of four hours “off” time reduction. The increased benefit in PDQ shown in this study is nevertheless quite persuasive that there is some real helpful feature of continuous intrajejunal delivery, at least in the short term.

There are other studies that show long term benefits of LCIG but they have not had the same design. Obviously, this design conducted over too long a period would not be ethical; presumably the principle is that all patients after 12 weeks would be offered LCIG, having already had their PEJ tubes inserted. On the other hand, in a longer term study, one would hope that every ongoing effort would be made to optimise therapy in the oral therapy group.

In practice, one must balance benefit versus side effects. Not all patients will want a PEJ tube, or to carry a large cartridge and pump. Virtually all patients had side effects, more serious ones in 13-20%. In 3% the treatment was discontinued as a result of surgical complications, 24% had tube dislocations, 21% insertion complications, 10% stoma complications, 8% pump malfunctions and 7% peritoneal problems. There are reports of neuropathy from LCIG but in this study there were three possible cases in the placebo group and only one in the treatment group.

Finally, LCIG is not the only advanced therapy available. There are no direct comparisons between LCIG and deep brain stimulation or apomorphine pump therapy to guide which treatment to select in individual patients, although the different inclusion and exclusion criteria do provide some help in choosing which therapy is appropriate for which patient. For example, age over 70 and history of depression exclude deep brain stimulation but not LCIG.


Mechanical Thrombectomy for Ischaemic Stroke



Stroke is the most common cause of disability in Western countries, and its lifetime risk is 1 in 6 for men and 1 in 5 for women. While managing acute stroke patients in hyperacute stroke units overall has modest benefits for short and long term outcome (e.g. 51% versus 47% independence and 29% versus 33% mortality), specific therapeutic options are limited. The first major option for treatment of ischaemic stroke was intravenous thrombolysis, paralleling its previous development in acute myocardial infarction.

However, while use in myocardial infarction was widespread in the 1990s, it has only been widely used to treat acute stroke in the last ten years. This is probably because of the narrower therapeutic window and the more severe consequences of haemorrhagic complications in the brain. In addition, its benefits are actually relatively modest. In the first main randomised clinical trial on its use within three hours (NINDS), and bearing in mind that a stroke often recovers spontaneously in the first hour (in which case it is termed a TIA), good outcomes (grades 0 to 1 on the Modified Rankin scale) were achieved in 39% of treated patients versus 26% of patients receiving placebo, but with a symptomatic brain haemorrhage risk 6% greater than in the placebo group.

When delivered between 3 and 4.5 hours after stroke onset (ECASS III), the benefits on the same scale were 52% vs 45%, which gave a relative risk confidence interval range of 1.01 to 1.34 (p=0.04). In other words, this was only just statistically significant in a study of 821 patients. The risk of causing intracranial haemorrhage was 27% versus 17.6% (p=0.001). Thrombolysis caused major symptomatic brain haemorrhage in 2.4% versus 0.3% of placebo patients (p=0.008).

So it is not surprising that there has been a move, just like in cardiology a decade or two earlier, away from relying solely on intravenous thrombolysis and towards direct intra-arterial catheter treatment. The paper, “Revolution in acute ischaemic stroke care: a practical guide to mechanical thrombectomy”, summarises recent evidence in favour of this treatment and the infrastructure required to manage patients in this way. This Journal Club review discusses issues around acute stroke treatment and the ramifications for delivery of such a service.


The Published Review

The first mechanical thrombectomy devices were approved for use in 2004, but it was only technical developments, and probably the improved expertise that comes with experience, that led to positive results as shown by a spate of studies published after 2010 employing a new generation of devices.

The HERMES collaboration meta-analysis revealed that 46% of patients had a good outcome with functional independence (grades 0-2 on the Modified Rankin scale) compared with 26.5% on best medical treatment. Most of the patients in both groups received intravenous (iv) thrombolysis, since in most study protocols patients had iv thrombolysis before going on to have thrombectomy an hour or so later. Mortality and the risk of brain haemorrhage did not differ between the two groups. The benefit seemed still to be present in patients over 80, and when patients did not receive iv thrombolysis, though the numbers to test the latter were small. While the window for thrombectomy was within 6 hours, there may still be improved outcomes up to 7.3 hours after symptom onset, but in general faster intervention leads to greater benefit. At a cost of £2500 per quality-adjusted life year (QALY), the procedure would be considered by any political criteria to be cost-effective.

The Thrombectomy technique has a number of variations depending on the Neuroradiologist and on the particular nature and location of the thrombus. It may be done under general anaesthesia or local anaesthesia with sedation and anaesthetic support. A large gauge catheter is directed to the internal carotid via a femoral puncture, and an intermediate catheter inside it is directed to the Circle of Willis. Then a microcatheter inside the intermediate one serves as a guide wire to the actual clot. The microcatheter is then removed and a stent retriever is placed within the clot, and pulled back to draw the clot to the intermediate catheter. Suction is applied to this catheter to remove the clot entirely. Some techniques involve directly removing the clot by suction on the intermediate catheter. A balloon may be located on the distal end of the clot to prevent forward movement (a clinician would describe this as embolus, an undesirable occurrence). When removing the clot reveals a tight lumen, there is the further option to perform angioplasty or stenting to open the vessel. The same can apply to a carotid stenosis occurring in tandem with a more distal thrombus.

The main complications are technical, including vessel perforation (1.6%), other symptomatic intracranial haemorrhage (3-9%), subarachnoid haemorrhage (0.6 to 5%), arterial dissection (0.6 to 3.9%), or emboli distally (1-9%). In addition, there can be vasospasm or issues related to the puncture site. While the total incidence is 15%, there is not always any actual adverse clinical consequence.

While the 6 hour time window for thrombectomy is wider than for intravenous treatment, there are other selection criteria that are more strict:

  • There should be a documented anterior circulation large vessel occlusion of the middle cerebral or carotid artery. (There is only limited evidence for efficacy in basilar occlusion.)
  • There should be good collateral cerebral circulation.
  • There should be relatively normal extracranial arterial anatomy from the technical viewpoint regarding passing the catheter.
  • There should be significant clinical deficit at the time of treatment (but this parallels the criteria that should be applied also to intravenous thrombolysis), while acknowledging that a large vessel occlusion with minimal clinical deficit nevertheless incurs a significant risk of clinical deterioration.
  • There should be a lack of extensive early ischaemic change on CT (using an ASPECTS score threshold of 5). The role of more advanced imaging, e.g. CT perfusion, to establish salvageable brain, is yet to be clarified.
  • Consideration should be given to pre-stroke functional status and the potential of benefit.
  • Patients should have had iv thrombolysis within 4.5 hours of symptom onset.

The authors report that there is little evidence on managing blood pressure around the time of the procedure. It is probably best to avoid lowering blood pressure unless it is greater than 220 mmHg systolic, or 200 mmHg systolic if there is evidence of clinical complications of hypertension.

Usually no specific anticoagulation is given around the procedure. Some interventionalists use a peri-procedure dose of heparin. Aspirin is avoided beforehand but patients can have their usual 300 mg aspirin dose starting 24 hours after their stroke. If a stent has been implanted, aspirin and clopidogrel are given together for the first 3-6 months.

Authors’ Conclusions

The authors emphasise the great benefits to be had in selected patients, and comment that the selection criteria may be broadened with future experience. In particular, cases of milder stroke with large vessel occlusions may prove to be good candidates, or the time window may broaden and perhaps be ignored altogether if advanced imaging reveals a reversible penumbra.

They highlight that the significant technical complication rate means that the procedure should be concentrated in centres that deal with a large number of cases to gain and maintain expertise. They describe two models: “drip and ship” where the patient is thrombolysed at a local HASU (or A&E resuscitation unit?) and ambulanced to the thrombectomy centre, versus “mothership”, where the patient is transferred straight to the thrombectomy unit.

Journal Club Comments

The 20% increased good outcome arising from mechanical thrombectomy on top of that from iv thrombolysis is impressive compared to the 13% reported for thrombolysis versus placebo.

While the selection criteria are more stringent, they are not very much more stringent than for thrombolysis alone; a middle cerebral artery occlusion is a common presentation of acute stroke, especially if it is more severe. The review estimated that 10% of acute stroke patients would be candidates. We suspect at most half that amount, given that in practice thrombolysis rates are 10%, and 5% in some centres.

The most striking issues for us were the very high degree of technical expertise required acutely for decision-making and performing the procedure, and the high technical complication rates that parallel the high levels of benefit. The Neuroradiologist appears to decide both before and during the procedure between a number of different technical options and items of equipment. The suspicion is that the complications, unlike the haemorrhage rates for iv thrombolysis, depend much less on blind luck than on user expertise.

We wondered about circumstances where there might be a contraindication to intravenous thrombolysis and yet not to thrombectomy; it does not appear that thrombolysis, or even anticoagulation or antiplatelet therapy, is actually required for the procedure, and intravenous thrombolysis is so short acting that it would not be protecting against new emboli resulting from the procedure. The trials were conducted according to a protocol of having received thrombolysis mainly for ethical reasons around not denying patients proven beneficial treatment.

However, for practical purposes, a poor candidate for thrombolysis is probably in general going to be a poor candidate for thrombectomy. It would nevertheless be interesting to see if the 20% benefit from thrombectomy overlaps with that from thrombolysis, or adds to it. In other words, could patients get a 20% benefit from thrombectomy alone, and not face the 6% risk of thrombolysis-induced brain haemorrhage?

As an aside to the discussion on benefits of stroke treatment, we noted the different slants that can be put on data. This has great practical consequences for the patient. So, returning for a moment to intravenous thrombolysis, at 3 to 4.5 hours after stroke, a clinician may explain to a patient (if they are not too dysphasic at the time), that they can deliver a treatment with an odds ratio of good outcome of 1.34. Or the clinician might more likely say there would be 34% better chance of recovery, or a third as much again better chance of recovery. Right?

Wrong! The odds ratio is the ratio of good versus bad outcome in the treated group over the ratio of good to bad outcome in the untreated group. What layperson would describe things in those terms – terms that deliberately magnify the benefit? The relative risk, i.e. the ratio of a good outcome in the treated group versus that in the untreated group, is what most laypeople would understand, and the figure is 1.16. Even then, this does not mean that 16% more patients have a good outcome. From the actual figures, 52% versus 45%, 7% more patients get a good outcome, which is considerably different from 34%, and not so favourable when at the same time there are 10% more patients getting brain haemorrhages (or should we say 53.4% more likely?!), though only 2.5% (700% more likely!!) of these haemorrhages are giving them a much bigger stroke than they otherwise would have had.
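As a rough check on the arithmetic above, here is a minimal Python sketch that works out the three ways of expressing the same result from the rounded percentages quoted in this post. The function name is my own. Because it uses the rounded 52% and 45% rather than the trial's raw counts, the odds ratio comes out nearer 1.32 than the published 1.34; the relative risk of about 1.16 and the absolute difference of 7 percentage points match the figures above.

```python
def effect_measures(p_treated, p_control):
    """Return odds ratio, relative risk and absolute difference
    for an outcome occurring with the given proportions."""
    odds_treated = p_treated / (1 - p_treated)
    odds_control = p_control / (1 - p_control)
    return {
        "odds_ratio": odds_treated / odds_control,
        "relative_risk": p_treated / p_control,
        "absolute_difference": p_treated - p_control,
    }


# Rounded percentages quoted in this post for treatment at 3 to 4.5 hours
results = {
    "good outcome": effect_measures(0.52, 0.45),
    "any intracranial haemorrhage": effect_measures(0.27, 0.176),
    "major symptomatic haemorrhage": effect_measures(0.024, 0.003),
}

for label, m in results.items():
    relative_increase = (m["relative_risk"] - 1) * 100
    print(
        f"{label}: OR {m['odds_ratio']:.2f}, RR {m['relative_risk']:.2f}, "
        f"relative increase {relative_increase:.1f}%, "
        f"absolute difference {m['absolute_difference'] * 100:.1f} points"
    )
```

The same 7 in 100 patients can thus be presented as an odds ratio of about 1.3, a relative risk of 1.16 or an absolute gain of 7 points, and the haemorrhage figures can be inflated in exactly the same way.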

What I would say at 3 to 4.5 hours after stroke onset is:

“We have a treatment available to dissolve clots in the brain that when given at this time after a stroke probably overall improves the chances of a good recovery, but which has risks of causing bleeding, including a brain haemorrhage that may make your stroke worse not better. Overall out of 100 people, on average 7 extra patients will get a good recovery from their stroke when they have the treatment, about 90 will be no different and 3 will be significantly worsened.”

And if the stroke is relatively mild, or one of those where one suspects the patient might be significantly better come the following morning regardless, one really wonders how much the patient stands to gain and whether to take that 2.5% risk of a much worse stroke instead.

The point about dysphasia is a serious one; can one ethically obtain proper consent to deliver a treatment that is definitely going to result in some people suffering additional permanent disability if not death? Even without dysphasia, lying semi-paralysed under a ticking clock is probably a situation, both for the patient and relatives, where choice, let alone informed consent, is an illusion. When consenting for emergency surgery, one generally has at least the impression that the benefits are an order of magnitude greater than the risks, or that a poor outcome without intervention is inevitable.

Another example of statistics and the all-important magical 0.05 p-value relates to the original comment about acute stroke units. The differences from general ward care are surprisingly modest, but the Stroke Unit Trialists’ Collaboration Cochrane review of 2009 is always quoted as showing that stroke unit care significantly reduces mortality. Sun et al. (2013) did their own analysis and actually looked at the data. There was a discrepancy in the number of deaths in the control group in the largest study, the Athens trial: 121 deaths versus 127. On contacting the Cochrane review author, they were told that there was an “error which will be corrected in the next update”; on doing the sums to correct the “error”, Sun found that the p value for significant reduction in mortality shifted across the magical 0.05 threshold from 0.03 to 0.06. So there is no clear evidence that stroke units reduce mortality…

If one looks objectively at the data:

  • Thrombectomy leads to 20% more good outcomes, which may replace rather than add to that from intravenous thrombolysis and with no higher risk of brain haemorrhage.
  • Thrombolysis alone leads to 13% more good outcomes, if given within a very restricted window of 3 hours after stroke onset, but with a significant risk of brain haemorrhage and other complications.
  • Stroke units, which also treat the other 90% of strokes, lead to 4% better outcomes, a figure of uncertain clinical significance.
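The three figures in the list above can be put on the same footing as absolute differences and numbers needed to treat. This is only a sketch using the rounded percentages quoted earlier in this post, and the function name is my own. The comparison is rough, since the figures come from different trials, with different patient groups and different definitions of a good outcome (Modified Rankin 0-2 for thrombectomy, 0-1 for thrombolysis, independence for stroke units).

```python
def number_needed_to_treat(p_treated, p_control):
    """Patients who must be treated for one extra good outcome."""
    difference = p_treated - p_control
    if difference <= 0:
        raise ValueError("no absolute benefit, so NNT is undefined")
    return 1 / difference


# (good outcome with intervention, good outcome without), as quoted in this post
comparisons = {
    "thrombectomy vs best medical treatment": (0.46, 0.265),
    "thrombolysis within 3 hours vs placebo": (0.39, 0.26),
    "stroke unit vs general care": (0.51, 0.47),
}

for label, (treated, control) in comparisons.items():
    nnt = number_needed_to_treat(treated, control)
    print(f"{label}: {100 * (treated - control):.1f} points, NNT about {nnt:.1f}")
```

On these figures roughly 5 patients need thrombectomy, 8 need thrombolysis, and 25 need stroke unit care for one extra good outcome.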

Regarding stroke units, it is possible that it is the 10% who are candidates for intervention that are contributing largely to that 4% improvement, along with those with haemorrhagic stroke getting surgical input or neurological stroke mimics getting fast-tracked to more appropriate acute care. And if general wards treating the other 90% had more focus on early swallow assessments and actually feeding nil-by-mouth patients nasogastrically within 48 hours, would that single measure not improve outcome?

The initial decision to perform thrombectomy is highly technical and requires a neurointerventional radiologist, the procedure obviously requires a neuroradiologist, and therefore the consent, as well as the post-procedure ward round and early outpatient follow-up, should probably be undertaken by the neuroradiologist. The neuroradiologist requires the support of an anaesthetist during the procedure, and perhaps around the procedure as an intensivist. The technical skill required to write a thrombolysis prescription is negligible; that to perform a highly challenging emergency procedure, to minimise technical complications arising from mistakes and to deal with those complications when they do arise, will make or break the success of thrombectomy and the success of the stroke service. Does it not seem that acute stroke care has shifted from a medical to a “surgical” speciality? Instead of a “mothership”, could we have a Neuroemergency Unit, a Neuro ITU next to a catheter lab, centred around the Neuroradiologist managing the patients with acute stroke who are going to benefit from intervention, as well as patients with subarachnoid haemorrhage? They would have support from anaesthetists, stroke physicians/neurologists and neurosurgeons, with stroke physicians and allied health professionals taking on the subsequent rehabilitation role.
