
Branch 1.6 #11185

Closed · wants to merge 739 commits into from
Conversation

JamieZZZ

No description provided.

brkyvz and others added 30 commits December 4, 2015 12:09
Python tests require access to the `KinesisTestUtils` file. When this file lives under src/test, Python can't access it, since it is not available in the assembly jar.

However, if we moved all of KinesisTestUtils to src/main, we would need to add the KinesisProducerLibrary as a dependency. To avoid this, I moved KinesisTestUtils to src/main and extended it with ExtendedKinesisTestUtils, which lives under src/test and adds support for the KPL.

cc zsxwing tdas

Author: Burak Yavuz <brkyvz@gmail.com>

Closes #10050 from brkyvz/kinesis-py.
…s in SparkR.

Author: Sun Rui <rui.sun@intel.com>

Closes #9804 from sun-rui/SPARK-11774.

(cherry picked from commit c8d0e16)
Signed-off-by: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
Need to match existing method signature

Author: felixcheung <felixcheung_m@hotmail.com>

Closes #9680 from felixcheung/rcorr.

(cherry picked from commit 895b6c4)
Signed-off-by: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
… be consistent with Scala/Python

Change ```numPartitions()``` to ```getNumPartitions()``` to be consistent with Scala/Python.
<del>Note: If we cannot catch up with the 1.6 release, it will be a breaking change for 1.7 that we also need to explain in the release note.</del>

cc sun-rui felixcheung shivaram

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #10123 from yanboliang/spark-12115.

(cherry picked from commit 6979edf)
Signed-off-by: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
1. Add `isNaN` to `Column` for SparkR. `Column` should have three related functions: `isNaN`, `isNull`, `isNotNull`.
2. Replace `DataFrame.isNaN` with `DataFrame.isnan` on the SparkR side, because `DataFrame.isNaN` has been deprecated and will be removed in Spark 2.0.
<del>3. Add `isnull` to `DataFrame` for SparkR. `DataFrame` should have two related functions: `isnan`, `isnull`.</del>

cc shivaram sun-rui felixcheung

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #10037 from yanboliang/spark-12044.

(cherry picked from commit b6e8e63)
Signed-off-by: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
Author: gcc <spark-src@condor.rhaag.ip>

Closes #10101 from rh99/master.

(cherry picked from commit 04b6799)
Signed-off-by: Sean Owen <sowen@cloudera.com>
When `\u` appears in a comment block (i.e. inside `/* */`), codegen will break. So, in Expression and CodegenFallback, we escape `\u` to `\\u`.
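A minimal sketch of the escaping idea, using a hypothetical helper (not the actual Expression/CodegenFallback code):

```scala
// Hypothetical helper illustrating the fix: double the backslash so javac
// does not treat \u inside a generated /* ... */ comment as a unicode escape.
def escapeUnicodeInComment(text: String): String =
  text.replace("\\u", "\\\\u")
```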

yhuai Please review it. I did reproduce it and it works after the fix. Thanks!

Author: gatorsmile <gatorsmile@gmail.com>

Closes #10155 from gatorsmile/escapeU.

(cherry picked from commit 49efd03)
Signed-off-by: Yin Huai <yhuai@databricks.com>
…y when Jenkins load is high

We need to make sure that the last entry is indeed the last entry in the queue.

Author: Burak Yavuz <brkyvz@gmail.com>

Closes #10110 from brkyvz/batch-wal-test-fix.

(cherry picked from commit 6fd9e70)
Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
This PR:
1. Suppress all known warnings.
2. Cleanup test cases and fix some errors in test cases.
3. Fix errors in HiveContext-related test cases. These test cases were actually not run previously due to a bug in creating TestHiveContext.
4. Support 'testthat' package version 0.11.0, which prefers that test cases be under 'tests/testthat'.
5. Make sure the default Hadoop file system is local when running test cases.
6. Turn warnings into errors.

Author: Sun Rui <rui.sun@intel.com>

Closes #10030 from sun-rui/SPARK-12034.

(cherry picked from commit 39d677c)
Signed-off-by: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
Currently, the current line is not cleared by Ctrl-C.

After this patch
```
>>> asdfasdf^C
Traceback (most recent call last):
  File "~/spark/python/pyspark/context.py", line 225, in signal_handler
    raise KeyboardInterrupt()
KeyboardInterrupt
```

It's still worse than 1.5 (and before).

Author: Davies Liu <davies@databricks.com>

Closes #10134 from davies/fix_cltrc.

(cherry picked from commit ef3f047)
Signed-off-by: Davies Liu <davies.liu@gmail.com>
…ner not present

The reason is that TrackStateRDDs generated by trackStateByKey expect the previous batch's TrackStateRDDs to have a partitioner. However, when recovering from DStream checkpoints, the RDDs recovered from RDD checkpoints do not have a partitioner attached to them. This is because RDD checkpoints do not preserve the partitioner (SPARK-12004).

While #9983 solves SPARK-12004 by preserving the partitioner through RDD checkpoints, there is a non-zero chance that saving and recovery fail. To be resilient, this PR repartitions the previous state RDD if the partitioner is not detected, as sketched below.
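A minimal sketch of that resilience step, with illustrative names and concrete key/value types to keep it short (not the actual TrackStateRDD code):

```scala
import org.apache.spark.HashPartitioner
import org.apache.spark.rdd.RDD

// If the state RDD recovered from a checkpoint lost its partitioner,
// repartition it so downstream TrackStateRDDs see the partitioner they expect.
def ensurePartitioned(
    recovered: RDD[(String, Int)],
    numPartitions: Int): RDD[(String, Int)] = {
  if (recovered.partitioner.isDefined) recovered
  else recovered.partitionBy(new HashPartitioner(numPartitions))
}
```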

Author: Tathagata Das <tathagata.das1565@gmail.com>

Closes #9988 from tdas/SPARK-11932.

(cherry picked from commit 5d80d8c)
Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
https://issues.apache.org/jira/browse/SPARK-11963

Author: Xusen Yin <yinxusen@gmail.com>

Closes #9962 from yinxusen/SPARK-11963.

(cherry picked from commit 871e85d)
Signed-off-by: Joseph K. Bradley <joseph@databricks.com>
…cala doc

In SPARK-11946 the API for pivot was changed a bit and its doc was updated, but the doc changes were not made for the Python API. This PR updates the Python doc to be consistent.

Author: Andrew Ray <ray.andrew@gmail.com>

Closes #10176 from aray/sql-pivot-python-doc.

(cherry picked from commit 36282f7)
Signed-off-by: Yin Huai <yhuai@databricks.com>
Switched from using SQLContext constructor to using getOrCreate, mainly in model save/load methods.

This covers all instances in spark.mllib.  There were no uses of the constructor in spark.ml.
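The pattern in question, sketched for an illustrative save method (not the actual MLlib diff):

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

// Reuse the singleton SQLContext instead of constructing a new one,
// which could shadow an existing context in the same JVM.
def save(sc: SparkContext, path: String): Unit = {
  val sqlContext = SQLContext.getOrCreate(sc)
  // ... build a DataFrame of model data with sqlContext and write it to `path`
}
```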

CC: mengxr yhuai

Author: Joseph K. Bradley <joseph@databricks.com>

Closes #10161 from jkbradley/mllib-sqlcontext-fix.

(cherry picked from commit 3e7e05f)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
…ing include_example

Made a new patch containing only the markdown examples moved to the example/ folder.
Only three Java code examples were not shifted, since they contained compilation errors; these classes are:
1) StandardScale 2) NormalizerExample 3) VectorIndexer

Author: Xusen Yin <yinxusen@gmail.com>
Author: somideshmukh <somilde@us.ibm.com>

Closes #10002 from somideshmukh/SomilBranch1.33.

(cherry picked from commit 78209b0)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
Add since annotation to ml.classification

Author: Takahashi Hiroshi <takahashi.hiroshi@lab.ntt.co.jp>

Closes #8534 from taishi-oss/issue10259.

(cherry picked from commit 7d05a62)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
…mple code

Add a `SQLTransformer` user guide and example code, and make the Scala API doc clearer.
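For reference, the transformer being documented is used like this (a sketch consistent with the guide; the column names are illustrative):

```scala
import org.apache.spark.ml.feature.SQLTransformer

// __THIS__ stands for the input DataFrame in the statement.
val sqlTrans = new SQLTransformer().setStatement(
  "SELECT *, (v1 + v2) AS v3, (v1 * v2) AS v4 FROM __THIS__")
```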

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #10006 from yanboliang/spark-11958.

(cherry picked from commit 4a39b5a)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
…means Value

Author: cody koeninger <cody@koeninger.org>

Closes #10132 from koeninger/SPARK-12103.

(cherry picked from commit 48a9804)
Signed-off-by: Sean Owen <sowen@cloudera.com>
Author: Jeff Zhang <zjffdu@apache.org>

Closes #10172 from zjffdu/SPARK-12166.

(cherry picked from commit 7081291)
Signed-off-by: Sean Owen <sowen@cloudera.com>
This reverts PR #10002, commit 78209b0.

The original PR wasn't tested on Jenkins before being merged.

Author: Cheng Lian <lian@databricks.com>

Closes #10200 from liancheng/revert-pr-10002.

(cherry picked from commit da2012a)
Signed-off-by: Cheng Lian <lian@databricks.com>
Fix commons-collection group ID to commons-collections for version 3.x

Patches earlier PR at #9731

Author: Sean Owen <sowen@cloudera.com>

Closes #10198 from srowen/SPARK-11652.2.

(cherry picked from commit e3735ce)
Signed-off-by: Sean Owen <sowen@cloudera.com>
Checked with Hive: greatest/least should cast their children to the tightest common type,
i.e. `(int, long) => long`, `(int, string) => error`, `(decimal(10,5), decimal(5,10)) => error`
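A couple of illustrative calls consistent with that rule (a sketch, run in a spark-shell with a `sqlContext` in scope):

```scala
// (int, bigint) widens to the tightest common type, bigint:
sqlContext.sql("SELECT greatest(1, 2147483648)").show()

// No tightest common type for (int, string), so analysis fails:
// sqlContext.sql("SELECT greatest(1, 'a')")  // AnalysisException
```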

Author: Wenchen Fan <wenchen@databricks.com>

Closes #10196 from cloud-fan/type-coercion.

(cherry picked from commit 381f17b)
Signed-off-by: Michael Armbrust <michael@databricks.com>
This PR is to add three more data types into Encoder, including `BigDecimal`, `Date` and `Timestamp`.
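If merged, usage would look roughly like this (a sketch, assuming the new types are exposed through the `Encoders` factory per this PR):

```scala
import org.apache.spark.sql.Encoders

// Encoders for the three newly supported types:
val decimalEnc   = Encoders.DECIMAL    // java.math.BigDecimal
val dateEnc      = Encoders.DATE       // java.sql.Date
val timestampEnc = Encoders.TIMESTAMP  // java.sql.Timestamp
```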

marmbrus cloud-fan rxin Could you take a quick look at these three types? Not sure if it can be merged to 1.6. Thank you very much!

Author: gatorsmile <gatorsmile@gmail.com>

Closes #10188 from gatorsmile/dataTypesinEncoder.

(cherry picked from commit c0b13d5)
Signed-off-by: Michael Armbrust <michael@databricks.com>
… APIs

This PR contains the following updates:

- Created a new private variable `boundTEncoder` that can be shared by multiple functions: `RDD`, `select` and `collect`.
- Replaced all uses of `queryExecution.analyzed` with the function call `logicalPlan`.
- Fixed a few API comments that used wrong class names (e.g., `DataFrame`) or parameter names (e.g., `n`).
- Fixed a few wrong API descriptions (e.g., `mapPartitions`).

marmbrus rxin cloud-fan Could you take a look and check if they are appropriate? Thank you!

Author: gatorsmile <gatorsmile@gmail.com>

Closes #10184 from gatorsmile/datasetClean.

(cherry picked from commit 5d96a71)
Signed-off-by: Michael Armbrust <michael@databricks.com>
jira: https://issues.apache.org/jira/browse/SPARK-10393

Since the logic of the text processing part has been moved to ML estimators/transformers, replace the related code in LDA Example with the ML pipeline.

Author: Yuhao Yang <hhbyyh@gmail.com>
Author: yuhaoyang <yuhao@zhanglipings-iMac.local>

Closes #8551 from hhbyyh/ldaExUpdate.

(cherry picked from commit 872a2ee)
Signed-off-by: Joseph K. Bradley <joseph@databricks.com>
…unction

Delays application of ResolvePivot until all aggregates are resolved, to prevent problems with UnresolvedFunction, and adds a unit test.

Author: Andrew Ray <ray.andrew@gmail.com>

Closes #10202 from aray/sql-pivot-unresolved-function.

(cherry picked from commit 4bcb894)
Signed-off-by: Yin Huai <yhuai@databricks.com>
jira: https://issues.apache.org/jira/browse/SPARK-11605
Check Java compatibility for MLlib for this release.

fix:

1. `StreamingTest.registerStream` needs a Java-friendly interface.

2. `GradientBoostedTreesModel.computeInitialPredictionAndError` and `GradientBoostedTreesModel.updatePredictionError` have Java compatibility issues. Mark them as `DeveloperApi`.

TBD:
[updated] no fix for now per discussion.
`org.apache.spark.mllib.classification.LogisticRegressionModel`
`public scala.Option<java.lang.Object> getThreshold();` has the wrong return type for Java invocation.
`SVMModel` has a similar issue.

Yet adding a `scala.Option<java.lang.Double> getThreshold()` would result in an overloading error due to the same function signature, and adding a new function with a different name seems unnecessary.

cc jkbradley feynmanliang

Author: Yuhao Yang <hhbyyh@gmail.com>

Closes #10102 from hhbyyh/javaAPI.

(cherry picked from commit 5cb4695)
Signed-off-by: Joseph K. Bradley <joseph@databricks.com>
Documentation regarding the `IndexToString` label transformer with code snippets in Scala/Java/Python.
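A minimal usage sketch matching the documentation being added (the column names are illustrative):

```scala
import org.apache.spark.ml.feature.IndexToString

// Maps a column of label indices back to the original string labels.
val converter = new IndexToString()
  .setInputCol("categoryIndex")
  .setOutputCol("originalCategory")
```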

Author: BenFradet <benjamin.fradet@gmail.com>

Closes #10166 from BenFradet/SPARK-12159.

(cherry picked from commit 06746b3)
Signed-off-by: Joseph K. Bradley <joseph@databricks.com>
This patch tightens them to `private[memory]`.

Author: Andrew Or <andrew@databricks.com>

Closes #10182 from andrewor14/memory-visibility.

(cherry picked from commit 9494521)
Signed-off-by: Josh Rosen <joshrosen@databricks.com>
Author: Michael Armbrust <michael@databricks.com>

Closes #10060 from marmbrus/docs.

(cherry picked from commit 3959489)
Signed-off-by: Michael Armbrust <michael@databricks.com>
gatorsmile and others added 25 commits February 1, 2016 11:22
JIRA: https://issues.apache.org/jira/browse/SPARK-12989

In the rule `ExtractWindowExpressions`, we simply replace an alias by the corresponding attribute. However, this causes an issue exposed by the following case:

```scala
val data = Seq(("a", "b", "c", 3), ("c", "b", "a", 3)).toDF("A", "B", "C", "num")
  .withColumn("Data", struct("A", "B", "C"))
  .drop("A")
  .drop("B")
  .drop("C")

val winSpec = Window.partitionBy("Data.A", "Data.B").orderBy($"num".desc)
data.select($"*", max("num").over(winSpec) as "max").explain(true)
```
In this case, both `Data.A` and `Data.B` are aliases in `WindowSpecDefinition`. If we replace these alias expressions by their alias names, we are unable to know what they are, since they will not be put in `missingExpr` either.

Author: gatorsmile <gatorsmile@gmail.com>
Author: xiaoli <lixiao1983@gmail.com>
Author: Xiao Li <xiaoli@Xiaos-MacBook-Pro.local>

Closes #10963 from gatorsmile/seletStarAfterColDrop.

(cherry picked from commit 33c8a49)
Signed-off-by: Michael Armbrust <michael@databricks.com>
It seems to me `lib` is better, because the `datanucleus` jars are located in `lib` for release builds.

Author: Takeshi YAMAMURO <linguin.m.s@gmail.com>

Closes #10901 from maropu/DocFix.

(cherry picked from commit da9146c)
Signed-off-by: Michael Armbrust <michael@databricks.com>
Changed the target to branch-1.6, from #10635.

Author: Takeshi YAMAMURO <linguin.m.s@gmail.com>

Closes #10915 from maropu/pr9935-v3.
It is not valid to call `toAttribute` on a `NamedExpression` unless we know for sure that the child produced that `NamedExpression`.  The current code worked fine when the grouping expressions were simple, but when they were a derived value this blew up at execution time.

Author: Michael Armbrust <michael@databricks.com>

Closes #11011 from marmbrus/groupByFunction.
Author: Michael Armbrust <michael@databricks.com>

Closes #11014 from marmbrus/seqEncoders.

(cherry picked from commit 29d9218)
Signed-off-by: Michael Armbrust <michael@databricks.com>
…ML python models' properties

Backport of [SPARK-12780] for branch-1.6

Original PR for master: #10724

This fixes StringIndexerModel.labels in pyspark.

Author: Xusen Yin <yinxusen@gmail.com>

Closes #10950 from jkbradley/yinxusen-spark-12780-backport.
I've tried to solve some of the issues mentioned in https://issues.apache.org/jira/browse/SPARK-12629.
Please let me know what you think.
Thanks!

Author: Narine Kokhlikyan <narine.kokhlikyan@gmail.com>

Closes #10580 from NarineK/sparkrSavaAsRable.

(cherry picked from commit 8a88e12)
Signed-off-by: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
The Java mapWithState that takes a Function3 has a wrong conversion of the Java `Optional` to the Scala `Option`; the fixed code uses the same conversion used in the mapWithState call that takes a Function4 as input. `Optional.fromNullable(v.get)` fails if v is `None`; it is better to use `JavaUtils.optionToOptional(v)` instead.
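The two conversions contrasted in sketch form; the pattern match is an inline equivalent of what `JavaUtils.optionToOptional` does, not the Spark-internal code itself:

```scala
import com.google.common.base.Optional

// Broken: v.get throws NoSuchElementException when v is None.
def broken(v: Option[String]): Optional[String] =
  Optional.fromNullable(v.get)

// Fixed: handles Some and None uniformly.
def fixed(v: Option[String]): Optional[String] = v match {
  case Some(x) => Optional.of(x)
  case None    => Optional.absent()
}
```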

Author: Gabriele Nizzoli <mail@nizzoli.net>

Closes #11007 from gabrielenizzoli/branch-1.6.
…lumn name duplication

Fixes the problem and verifies the fix with a test suite.
Also adds an optional parameter `nullable: Boolean` to `SchemaUtils.appendColumn`
and deduplicates the `SchemaUtils.appendColumn` functions.

Author: Grzegorz Chilkiewicz <grzegorz.chilkiewicz@codilime.com>

Closes #10741 from grzegorz-chilkiewicz/master.

(cherry picked from commit b1835d7)
Signed-off-by: Joseph K. Bradley <joseph@databricks.com>
Jira:
https://issues.apache.org/jira/browse/SPARK-13056

Create a map like `{ "a": "somestring", "b": null }`.
Query it like `SELECT col["b"] FROM t1;` and an NPE is thrown.

Author: Daoyuan Wang <daoyuan.wang@intel.com>

Closes #10964 from adrian-wang/npewriter.

(cherry picked from commit 358300c)
Signed-off-by: Michael Armbrust <michael@databricks.com>

Conflicts:
	sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
The example will throw an error like:
`<console>:20: error: not found: value StructType`

Need to add this line:
`import org.apache.spark.sql.types._`

Author: Kevin (Sangwoo) Kim <sangwookim.me@gmail.com>

Closes #10141 from swkimme/patch-1.

(cherry picked from commit b377b03)
Signed-off-by: Michael Armbrust <michael@databricks.com>
https://issues.apache.org/jira/browse/SPARK-13122

A race condition can occur in MemoryStore's unrollSafely() method if two threads that
return the same value for currentTaskAttemptId() execute this method concurrently. This
change makes the operation of reading the initial amount of unroll memory used, performing
the unroll, and updating the associated memory maps atomic in order to avoid this race
condition.

The initial proposed fix wraps all of unrollSafely() in a memoryManager.synchronized { } block. A cleaner approach might be to introduce a mechanism that synchronizes based on task attempt ID. An alternative option might be to track unroll/pending-unroll memory based on block ID rather than task attempt ID.
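A self-contained sketch of the atomicity requirement (illustrative names, not the actual MemoryStore code): the read of the current unroll memory and the map update must happen under one lock, because two threads can share the same task attempt ID.

```scala
import scala.collection.mutable

object UnrollAccounting {
  private val lock = new Object
  private val unrollMemoryByTask = mutable.Map.empty[Long, Long]

  def reserve(taskAttemptId: Long, bytes: Long): Unit = lock.synchronized {
    val current = unrollMemoryByTask.getOrElse(taskAttemptId, 0L) // read ...
    unrollMemoryByTask(taskAttemptId) = current + bytes           // ... and write atomically
  }
}
```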

Author: Adam Budde <budde@amazon.com>

Closes #11012 from budde/master.

(cherry picked from commit ff71261)
Signed-off-by: Andrew Or <andrew@databricks.com>

Conflicts:
	core/src/main/scala/org/apache/spark/storage/MemoryStore.scala
…uration columns

I have clearly prefixed the two 'Duration' columns in the 'Details of Batch' Streaming tab as 'Output Op Duration' and 'Job Duration'.

Author: Mario Briggs <mario.briggs@in.ibm.com>
Author: mariobriggs <mariobriggs@in.ibm.com>

Closes #11022 from mariobriggs/spark-12739.

(cherry picked from commit e9eb248)
Signed-off-by: Shixiong Zhu <shixiong@databricks.com>
…ld not fail analysis of encoder

Nullability should only be considered an optimization rather than part of the type system, so instead of failing analysis for mismatched nullability, we should pass analysis and add a runtime null check.

backport #11035 to 1.6

Author: Wenchen Fan <wenchen@databricks.com>

Closes #11042 from cloud-fan/branch-1.6.
minor fix for api link in ml onevsrest

Author: Yuhao Yang <hhbyyh@gmail.com>

Closes #11068 from hhbyyh/onevsrestDoc.

(cherry picked from commit c2c956b)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
…ot set but timeoutThreshold is defined

Check the state's existence before calling get.
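The guard in sketch form, using Spark Streaming's `State` API; the surrounding update function is illustrative:

```scala
import org.apache.spark.streaming.State

// Only call state.get() when the state exists; otherwise a key whose state
// was never set (but has a timeout threshold) would throw on get().
def update(value: Option[Int], state: State[Int]): Unit = {
  if (state.exists) {
    state.update(state.get() + value.getOrElse(0)) // safe: guarded by exists
  } else {
    state.update(value.getOrElse(0))
  }
}
```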

Author: Shixiong Zhu <shixiong@databricks.com>

Closes #11081 from zsxwing/SPARK-13195.

(cherry picked from commit 8e2f296)
Signed-off-by: Shixiong Zhu <shixiong@databricks.com>
Author: Bill Chambers <bill@databricks.com>

Closes #11094 from anabranch/dynamic-docs.

(cherry picked from commit 66e1383)
Signed-off-by: Andrew Or <andrew@databricks.com>
There is a bug when we try to grow the buffer: an OOM is wrongly ignored (the assert is also skipped by the JVM), then we try to grow the array again; this triggers spilling, which frees the current page, and the record we just inserted becomes invalid.

The root cause is that the JVM has less free memory than the MemoryManager thinks, so it OOMs when allocating a page without triggering spilling. We should catch the OOM and acquire memory again to trigger spilling.

Also, we should not grow the array in `insertRecord` of `InMemorySorter` (it was there just for easy testing).

Author: Davies Liu <davies@databricks.com>

Closes #11095 from davies/fix_expand.
…ters with Jackson 2.2.3

Patch to

1. Shade jackson 2.x in spark-yarn-shuffle JAR: core, databind, annotation
2. Use maven antrun to verify the JAR has the renamed classes

Being Maven-based, I don't know if the verification phase kicks in on an SBT/jenkins build. It will on a `mvn install`

Author: Steve Loughran <stevel@hortonworks.com>

Closes #10780 from steveloughran/stevel/patches/SPARK-12807-master-shuffle.

(cherry picked from commit 34d0b70)
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
JIRA: https://issues.apache.org/jira/browse/SPARK-10524

Currently we use the hard prediction (`ImpurityCalculator.predict`) to order categories' bins. But we should use the soft prediction.

Author: Liang-Chi Hsieh <viirya@gmail.com>
Author: Liang-Chi Hsieh <viirya@appier.com>
Author: Joseph K. Bradley <joseph@databricks.com>

Closes #8734 from viirya/dt-soft-centroids.

(cherry picked from commit 9267bc6)
Signed-off-by: Joseph K. Bradley <joseph@databricks.com>
… SpecificParquetRecordReaderBase

This is a minor followup to #10843 to fix one remaining place where we forgot to use reflective access of TaskAttemptContext methods.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #11131 from JoshRosen/SPARK-12921-take-2.
Update Aggregator links to point to #org.apache.spark.sql.expressions.Aggregator

Author: raela <raela@databricks.com>

Closes #11158 from raelawang/master.

(cherry picked from commit 719973b)
Signed-off-by: Reynold Xin <rxin@databricks.com>
…e system besides HDFS

jkbradley I tried to improve the function that exports a model. When I tried to export a model to S3 under Spark 1.6, it didn't work, so the function should support S3 besides HDFS. Can you review it when you have time? Thanks!

Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>

Closes #11151 from yu-iskw/SPARK-13265.

(cherry picked from commit efb65e0)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
…n error

Pyspark Params class has a method `hasParam(paramName)` which returns `True` if the class has a parameter by that name, but throws an `AttributeError` otherwise. There is not currently a way of getting a Boolean to indicate if a class has a parameter. With Spark 2.0 we could modify the existing behavior of `hasParam` or add an additional method with this functionality.

In Python:
```python
from pyspark.ml.classification import NaiveBayes
nb = NaiveBayes()
print nb.hasParam("smoothing")
print nb.hasParam("notAParam")
```
produces:
> True
> AttributeError: 'NaiveBayes' object has no attribute 'notAParam'

However, in Scala:
```scala
import org.apache.spark.ml.classification.NaiveBayes
val nb  = new NaiveBayes()
nb.hasParam("smoothing")
nb.hasParam("notAParam")
```
produces:
> true
> false

cc holdenk

Author: sethah <seth.hendrickson16@gmail.com>

Closes #10962 from sethah/SPARK-13047.

(cherry picked from commit b354673)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
…alue parameter

Fix this defect by checking whether the default value exists or not.

yanboliang Please help to review.

Author: Tommy YU <tummyyu@163.com>

Closes #11043 from Wenpei/spark-13153-handle-param-withnodefaultvalue.

(cherry picked from commit d3e2e20)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
@AmplabJenkins

Can one of the admins verify this patch?

@vanzin (Contributor) commented Feb 12, 2016

Hi, can you please close this PR?

markpavey and others added 2 commits February 13, 2016 08:39
… Windows

Due to being on a Windows platform I have been unable to run the tests as described in the "Contributing to Spark" instructions. As the change is only to two lines of code in the Web UI, which I have manually built and tested, I am submitting this pull request anyway. I hope this is OK.

Is it worth considering also including this fix in any future 1.5.x releases (if any)?

I confirm this is my own original work and license it to the Spark project under its open source license.

Author: markpavey <mark.pavey@thefilter.com>

Closes #11135 from markpavey/JIRA_SPARK-13142_WindowsWebUILogFix.

(cherry picked from commit 374c4b2)
Signed-off-by: Sean Owen <sowen@cloudera.com>
…ailed test

JIRA: https://issues.apache.org/jira/browse/SPARK-12363

This issue was pointed out by yanboliang. When `setRuns` is removed from PowerIterationClustering, one of the tests fails. I found that some `dstAttr`s of the normalized graph were not the correct values but 0.0. Setting `TripletFields.All` in `mapTriplets` makes it work, as sketched below.
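The fix in sketch form (the graph types are illustrative; `TripletFields.All` tells GraphX to ship both source and destination attributes to the triplet view):

```scala
import org.apache.spark.graphx.{Graph, TripletFields}

// Without TripletFields.All, GraphX may not ship dstAttr to the triplet
// view, so it shows up as a default value (0.0 in the failing test).
def normalize(graph: Graph[Double, Double]): Graph[Double, Double] =
  graph.mapTriplets(
    e => e.attr / math.max(e.srcAttr, 1e-10), // reads vertex attributes
    TripletFields.All                         // ship src and dst attributes
  )
```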

Author: Liang-Chi Hsieh <viirya@gmail.com>
Author: Xiangrui Meng <meng@databricks.com>

Closes #10539 from viirya/fix-poweriter.

(cherry picked from commit e3441e3)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
asfgit closed this in 610196f on Feb 14, 2016